Jump to content

A BitMask-based Sudoku Solver


RTFC
 Share

Recommended Posts

The Short Version:

This set of well-annotated example scripts shows how to solve sudoku puzzles with simple, powerful bitmask functions applied on a massive scale, demonstrating both common solving techniques and highly-optimised brute-force.

All you need is:

XfilterSmall.png.dfa58473b5f768c223bc4ed230754759.png the CrossFilter GUI

Edited by RTFC
update
Link to comment
Share on other sites

The Longer Version:

Cells, Interlinked.

At the most basic level, all binary digital computing reduces to logical operations on individual bits, that is, the switches in circuitry that can be either 1 or 0 ("on" or "off", "True" or "False"). A grouped set of multiple on/off states can be called a bitmask (or "bit vector"), and when we apply logic to change the state of one bitmask by one or more other bitmasks, we can evaluate, in a single function call, as many bit-states as the targeted bitmask contains (in this case: over 362 thousand, see below). Moreover, multiple same-sized bitmasks can be aligned in a rectangular grid, allowing matrix functions to manipulate all bitmasks simultaneously in a single call. A powerful set of simple tools for this purpose is provided by Eigen4AutoIt (or E4A for short).

To demonstrate the bitmask prowess of E4A, this example script shows three independent approaches to solving sudoku puzzles. Sudoku's are mathematically simple problems, yet their solution can be computationally hard. Given ten example puzzles (external .sdk files can also be imported, see the Remarks section in the script), the Solver approach implements a number of well-documented sudoku-solving techniques, with which to reduce the number of candidate values per cell. These work well for easy sudoku's, but get stuck (example sudoku #6), or fail completely in harder cases (examples #9 and #10).

By contrast, Brute-Force consists of six nested loops (Why six? See below!), and will identify either the first, or all valid solutions (depending on whether "early-out upon first hit" is enabled). It will never fail, but may take a long time for the hardest problems. For example, the (in)famous "Easter Monster" sudoku (example #9) takes this script over twenty minutes to solve, at least on my decrepit old laptop.:( But the main aim in writing it was clarity and legibility, not speed. Finally, the CrossFilter method demonstrates how neighbouring sets iteratively constrain one another further, whittling down the number of valid combinations from thousands to a few (with remaining invalids dispensed with by applying a few seconds of Brute Force). This final approach can run through an internal library over a hundred sudokus.

The Brute-Force and CrossFilter methods contain some features that I have not encountered in other such software, notably:

  • The chosen unit of permutation is the 3x3 sudoku block. One could alternatively have selected the 1x9 row, or the 9x1 column as foundation, and one possible extension would be to consider all three units in parallel, and select the most effective type of unit for each problem separately. But here we consider only blocks; we do not care a rat's behind about the content of individual cells. Cells suck.

 

  • Each sudoku consists of 9 blocks (from TL to BR, see left panel in diagram 1 on the download page), and each block has 9 factorial ("9!") permutations, meaning the 9 unique digits 1-9 can be arranged within a block in exactly 362,880 unique configurations (see diagram 1, right panel). In other words, in the first-selected empty cell, all 9 digits are possible to place, followed by 8 remaining in the next (as one digit is used up in the first cell), then 7 choices are left, then 6, 5, 4, 3, 2, and the final cell is predetermined. We store these on file in a 9 x 362,880 matrix table called "blockpermuts.mat" (provided). The algorithm to generate it is included in the script.

 

  • A valid sudoku solution contains 9 of these 3x3 blocks, in a 3x3 grid that additionally satisfies the constraints of having 9 unique digits (1-9) in each row and in each column. Note that the same 9 blocks can be re-arranged in 3! x 3! = 6 x 6 = 36 different ways, by swapping block rows and/or block columns. So the same solution in terms of the 9 block permutation IDs (which is what we're after) can give rise to 36 different-looking sudoku's (we ignore rotations and mirroring here, because those operations would change the block IDs). This means we can ourselves swap block rows and cols of the presented puzzle as we see fit, without affecting the solution in terms of block IDs. Why is this important?

 

  • Any specific block imposes strong constraints on the validity of both neighbouring blocks in the same block row (horizontally in either direction), and in the same block column (vertically in either direction). For example, the left panel of diagram 2 shows the TL (top-left) block as unconstrained (362,880 permutations), but as soon as we choose any single one of these, the valid permutations of its immediate row neighbour (TC, see diagram 1) are reduced to 12,096, and likewise for its immediate col neighbour (CL). Even better, given chosen blocks for TC and CL as well, their respective next neighbours in the same block row (TR) and col (BL) have their valid permutations reduced (from over 362K) to a mere 216 each, because TL imposes additional constraints on them. The central block (CC) is more complicated, ranging from 384 to 448 valid permutations (depending on which TC and CL blocks are chosen), but the joint possibilities always yield a mean of 404. Best of all, given ANY six known blocks, the remaining three have their options reduced to either 0 (invalid configuration) or 1 (= sudoku solved).

 

  • Combining the previous two insights, we can optimise the blocks arrangement by swapping block rows and block cols, so that the two blocks with the highest number of remaining valid permutations (after processing the clues) end up on the block diagonal (here we use TL and BR, see diagram 2, right panel). Brute-Force then cycles through the (smaller-sized) valid permutations in the off-diagonal blocks at top-right (TC, TR, and CR) and at bottom-left (CL, BL, and BC). Inside the innermost loop, 6 out of 9 blocks are known, automatically defining the three diagonal blocks TL, CC, and BR without any further cycling or evaluation. We only need to check if the full solution is found. That is, we can completely ignore the two hardest blocks, of which we know least (plus a third on the diagonal that is selected automatically when we swap the second block to BR).

 

  • Brute-Force would still take a long time, if it wasn't for the properties of (in)valid block neighbours. Therefore, two provided digit tables ("digitInRows.mat" and "digitInCols.mat", see diagram 3) record as bitmasks the row c.q. col position inside a block of each digit (3 options x 9 digits = 27 rows), for all block permutations (362,880 cols). This is where the power of matrix operations really shines, because in two simple E4A function calls we can eliminate ALL invalid row/col neighbours of a block (see  diagram 4). Moreover, the resulting bitmasks can be stacked as multiple exclusion filters to further reduce the number of cycles in the nested loops (see diagram 4). So given a choice of TC, TR's options are reduced; given TR and CL, CR's options are reduced; and once BL is also selected, it constrains BC in tandem with TC. Most importantly, whenever any bitmask becomes zero for all bits (easily tested with E4A's _Eigen_IsZero function), we can exit the current nesting level prematurely and move on.


The final result is a succinct, highly-optimised algorithm that is (hopefully) easy to understand. It illustrates that you don't need a maths degree to get lots of mileage out of the E4A environment. It is not the fastest sudoku solver out there (that wouldn't be written in AutoIt to begin with), but it shows how bitmasks can be gainfully employed to tackle a variety of programming challenges. For some other applications, you can study (and maybe improve or extend) the various solver methods, which all apply bitmask logic as well. Have Fun!

And remember, kids: Cells Suck, but Blocks Rock!

Edited by RTFC
update
Link to comment
Share on other sites

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

@jchd: Blast from the past, thanks for that! Haven't looked at APL since my student days at uni. Still remember being impressed it could squeeze a complete least-squares solver into a single line of code. Borderline illegible though, especially after a few months break. ("Who wrote that crap? Oh wait, that would be me.") I can see why it never appealed to the masses, it's like RegExp on steroids.;)

Yes, I can see the similarity with bitmasks. But I think it's actually closer in functionality to Knuth's Algorithm X approach of the exact cover problem, which is of course unbeatable for speed (mere ms), but which would fail on any and all underdetermined sudokus, whereas brute-force would happily barrel through (and collect) all possible solutions. Still, kind of humbling to see APL squeeze a solution into fewer than 10 lines of code, when mine needs over a thousand. But I'd argue that my code is a bit easier on the eyes...

Edited by RTFC
clarification
Link to comment
Share on other sites

I confess not having heavily used APL since the IBM 5100 (geez, almost 40 years ago!) mostly because it's an unreadable list of bizare symbols just few weeks after you wrote it.

If you like both easy to use, super-fast and completely cryptic code for solving sudoku, then you'll love the SQL version!

The SQL setup is complicated, mostly because we have no way to declare variables, arrays and such in pure SQL, but as always it boils down to requesting what the conforming result(s) should satisfy without explicitely specifying how to achieve that goal.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Yeah, I saw that when I did a brief survey what other AutoIt-based sudoku solvers were already out there. I like "easy, to use," love "super-fast", but am not too keen on "completely cryptic code" (except in a CodeCrypter context of course:D). And I lost my virginity to an IBM 370 mainframe (she was cold and not very tender). I remember those days well; the world was still a lush paradise, golden sunsets, warm breezes, and then that damned asteroid hit and wiped most of us out, and now those pesky mammals rule the roost (those bipeds are the worst!).

Edited by RTFC
Link to comment
Share on other sites

  • 3 weeks later...

Version 1.1 is released, containing an additional stand-alone Brute-Force (BF) solver with an optimised algorithm and optional GUI.

Just couldn't shake a niggling suspicion:think: that the BF approach was still suboptimal, so I threw some heavier maths and stats at the problem, grinding through a sample of about a hundred sudoku's of varying complexity. Some interesting new features came to light, some of which have been incorporated into the new BF script; the BF approach in the original script remains untouched.

Link to comment
Share on other sites

Version 1.2 is released, with a new CrossFilter-based script (note: requires E4A v5.1) that drives a more colourful, mesmerising GUI (if you're very easily mesmerised, that is). That solver also contains an internal library of well over a hundred sudokus, from easy to extreme. Watching this is almost as soothing as staring at an aquarium, and at a fraction of the cost.

Link to comment
Share on other sites

Just to clarify, this thread is not actually about solving sudokus, but about how to solve a variety of programming challenges with bitmasks. I used sudokus because they're probably the best-known numerical puzzle around at the moment. But solving any sudoku in mere milliseconds is a solved problem (google: "dancing links," "exact cover," or see example code here, here, or here), and writing an AutoIt version thereof is not on my to-do list.

Link to comment
Share on other sites

 

11 minutes ago, RTFC said:

Just to clarify, this thread is not actually about solving sudokus, *snip * But solving any sudoku in mere milliseconds is a solved problem 

Solving a (solved) problem with a (not really suitable) programming language is something "for fun".

But to solve a problem with an unusual approach is exciting and "expands the horizon". Thank you for that!

 

OT:

Nowadays, working with Terabyte-sized programming languages to solve the simplest problems with Titanic-sized programs running on gigaherz-powered, kilowatts burning Multicore-Computers, is NORMAL! Why? Because we can!

Solving a Sudoku-puzzle is possible with a 62 BYTES (yes Bytes) sized program within Milliseconds. No compiler is necessary, no frameworks, no libraries, no overpowered Hardware. Just to mention.

Solving a problem definitely does not depend on modern tools.

 

On 6/11/2020 at 11:46 AM, RTFC said:

And remember, kids: Cells Suck, but Blocks Rock!

And Brain rulez!😉

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

×
×
  • Create New...