Solving Color Nonograms

Background
First abandoned algorithm
Second abandoned algorithm
First acceptable algorithm for the first step
Next step, assembling the rows
First acceptable algorithm for the second step
Second (sort of) acceptable algorithm for the second step
Cutting out the pikers
Cutting out yet more pikers
An exercise for the reader
The source code
Comparisons between languages
Optimizations done to the scripts

Background

Some time ago I wrote a Java applet that would let people play nonograms (which were called 'Paint-by-Number' puzzles when I first met them). I then had to create a bunch of puzzles for people to solve. I made extensive use of Steven Simpson's nonogram program to make sure that these puzzles had unique solutions.

Recently, I revisited that my old applet and revised it to deal with color nonograms (that is, puzzles with colors other than simply black and white). Unfortunately, the nonogram program will not work with multiple colors. As a matter of fact, the .non format that it uses will not even admit to the possibility of the existence of such things.

So I decided to write my own. The first thing to do was to read in an .xpm file and figure out its row and columns keys would be. That was pretty simple. Next we had to forget what the picture looked like and solve it, using the keys that we'd generated. That turned out to be a bit harder.

First abandoned algorithm

Well, actually, I thought that it'd be simple. Simply run though every possible permutation of the colors in this image, and check each one to see if it matches the keys. The number of permutations should only be

numColors * numRows * numColumns

right?

Uh, wrong.

The actual number is

numColors(numRows * numColumns)

which is an entirely different kettle of fish.

I did write the code to do it, recursively looping through all rows, columns and colors, and successfully tested it with a little 2x3 test image, but when I tried it on this four-color 16x16 image...
ABC image
Well, let's see. 4(16 * 16) is .......1.340780793e+154.

Second abandoned algorithm

So next I decided to attack the individual rows and columns instead of the entire grid. I'd run through all possible permutations of each row, compare each against its key and select the ones that match. Then I'd try every combination of the matching rows and find the ones that match the column keys.

So, to start with, we need every permutation of a row of the image. That's 416, or, uh, 4.29497e+09.

I wrote this code up, just to see how bad it was, and it finished. It took several minutes, but it finished.

First acceptable algorithm for the first step

I took a step back, and realized that I wasn't going to solve this problem by crunching numbers - I was going to have to manipulate strings. So obviously I was going to have to switch from C to Perl, er, I mean, C++, and the STL string class.

So, after porting the code to read the xpm file and create the keys, I added a function that, for each key, calculated the number of white spaces needed to complete the row or column, counted the number of holes into which white spaces could be inserted, and then inserted every possible permutation of spaces into these holes. This code ran in a fraction of a second.

Next step, assembling the rows

Now that we've found all of the possible rows, all we have to do was to assemble every possible permutation of them and check each result. In the case of the ABC image, that's

15 * 105 * 105 * 13 * 105 * 286 * 105 * 105 * 14 * 105 * 286 * 286 * 16 * 16 * 105 * 14

or 2.1932685999e+26.

Ouch!

First acceptable algorithm for the second step

OK, this isn't going to work. So what do we try now?

What if we were to check the image as it's being built up? Every time we lay down a new row, we could make sure that the part of the image we have so far still matches the column keys. And if it doesn't, we don't have to check any of the permutations that start with this combination of rows.

This one works, mostly. Here's a graph showing the time it takes to solve 39 of the 40 16x16 puzzles that I had made up.

plot of first algorithm (edited)

That doesn't seem so bad, does it? One solution takes over a minute, but the rest are pretty reasonable.

OK, let me put the edited-out value back in...

plot of first algorithm (unedited)

Yep, you read that right. It's over five minutes. This is the offending image:

firecracker image

Second (sort of) acceptable algorithm for the second step

So what happens if we knock impossible possible columns out of the running the same way that we're already knocking out impossible possible rows? (Oh yeah, did I mention that I've also assembled a list of possible columns, and am using them to determine whether the partial image is still possible?)

We can't remove the possible columns from the main store, since we need to keep the originals for when we work our way back up the stack out of our recursions. But we can make a new copy for each new row, let it erase the columns that no longer fit and pass it on to the next row to be trimmed a bit more.

The results of this algorithm were interesting, but not altogether satisfying.

plot of both algorithms together

Note that the times for the two worst cases have been improved. But some of the other, previously reasonable, times have gotten very, very bad. It seems that for some images the extra overhead of making copies of the possible columns more than outweighs the savings of having to check fewer columns.

Cutting out the pikers

The next idea is to do some pre-processing, that is, to use the row keys to eliminate some of the possible columns, and visa versa.

So, for each row, we 'and' all of the possible permutations together. What we're doing is looking at all of a row's possible permutations and looking for squares that are the same color in each. That means that that square has to be that color. (This is the mechanical equivalent to a person looking at the puzzle and muttering, "Well, row #3 has a single black block 20 squares long, but there are only 16 squares in the row, so the middle 8 squares have to be black.")

When we've done this to all of the rows, we end up with a matrix with many squares of unknown color, but with some squares whose color we know. We can then check all of the possible columns and chuck the ones that don't match what we know know about the image. Then we can go on to solve the puzzle using one of the two algorithms that we've already got.

But before we do this, we might as well cut down on the possible rows as well, 'and'ing all of the possible columns together and throwing out the rows that don't match.

When I coded this up (using algorithm #1) I got these times...

plot of both algorithms, plus the first algorithm, pre-processed

Bingo!
Let's look at that again, using a log scale to see what's going on close to zero...

same as above, with a log scale

In nearly every instance, the extra processing to get rid of possible rows and columns pays off.

Let's compare the times of both algorithms when the data's been pre-processed.

plot of both algorithms, pre-processed

I think we have a winner.

Cutting out yet more pikers

So let's throw algorithm #2 out, since it's not saving us anything anymore.

But there's still one more thing we can try to speed things up. In the pre-processing step, we're using the row keys to eliminate some of the possible columns and the column keys to eliminate some of the possible rows, ending up with a subset of the original possible rows and columns. We can repeat that step, using the reduced set of possible rows to further reduce the set of possible columns, and visa versa. We can do this ad infinitum. Well not really, but we can do it until we reach a point where we can't eliminate any more rows or columns.

So, I changed the program to pre-process the puzzle until it won't pre-process anymore, and then went on to solve it using the first algorithm.

plot of first algorithm, repeatedly pre-processed

OK, that's good enough. I'm going to quit now.

But first I'll add a comment. It was interesting to note that when I switched to doing the pre-processing repeatedly, sometimes the puzzle actually got solved in the pre-processing step. I suspect, that for the original two-color nonograms, this pre-processing might always be enough to solve the puzzle.

The reason that it can't solve multi-colored nonograms is because it doesn't emulate the human reasoning completely. Where a person can think, "By the row key I see that this square can be either black or white, while by the column key it must be either white or red. Therefore, it must be white.", this program can only mark it as 'unknown'.

An exercise for the reader

Remember a little while ago, when I said that this program can't completely emulate human reasoning when solving multi-colored nonograms? Well, I'm thinking that it can be made to. All it would take would be to add some intelligence to the Image class, to be able to handle states like 'red or white', in addition to the current valid states of 'red', 'white', 'black', etc., or 'unknown color'.

Never mind, I went and did this myself. The pre-processing step now keeps track of multiplicities of colors when trimming possible rows and columns. Since this brings the pre-processing very nearly up to the level of what a human can do, I decided to print out the preliminary image in addition to any final ones.

The source code

Here's the final C code (compiled with gcc 3.2.2)

And here's another one written in Perl (v5.8.0). This one uses the same algorithm, but the performance is somewhat different. For 'simple' puzzles, in which there are only a few hundred combinations of possible rows and columns, the Perl program typically takes 3 to 4 times longer than the C++ program. But for more difficult puzzles, those having thousands of possible combinations, the difference in performance flips dramatically, with the Perl program doing in under a minute what would take the C++ program a half hour.

And here's one in Python (v2.2.2). It's using the same algorithm, and is as complete as the Perl script, and ends up being even faster. I think I may have just become a convert.

And just because I don't know when to stop, here's another one in Ruby (v1.6.8).

Comparisons between languages

Here are the final results for the two scripts (for the first set of 16x16 puzzles, along with another set of 20x20 puzzles), after I optimized the heck out of both of them:

plot of Perl and Python execution times (16x16 puzzles) plot of Perl and Python execution times (20x20 puzzles)

And just for comparison, here's the C++ results added in as well:

plot of C++, Perl and Python execution times (16x16 puzzles) plot of C++, Perl and Python execution times (20x20 puzzles)

In this application, the major advantage that both Perl and Python have over C++ is their regular expression support. And the advantage that Python has over Perl is its ability to split regular expression pattern matching in two - to compile the regular expression once, and then do the pattern matching many times.

And here are the results of the Perl, Python and Ruby scripts together:

plot of Perl, Python and Ruby execution times (16x16 puzzles) plot of Perl, Python and Ruby execution times (20x20 puzzles)

Ruby, like Python, has the capability to compile a regular execution once before matching it repeatedly. But try as I might, I still couldn't get the Ruby execution times down to the level of the Python ones. For the worst puzzles, I couldn't even get the Ruby script to be as fast as the Perl one. Now granted, I'd never looked at Ruby before (in other words, I'm a Ruby rube) so that might merely be because of my inexperience.

Optimizations done to the scripts

Perl Python
Used s/// to manipulate the strings. Used s/// to manipulate the strings.
Used substr() to extract one character from a string.
(faster than unpack(), m//, or split //)
Used list.reverse() instead of counting the index downwards
Used hashes to store the multi-color state.
(faster than strings or lists)
Used dictionaries to store the multi-color state.
(faster than strings or lists)
Changed function calls in loops to inline statements. Used deeply indexed data instead of convenience objects.
Find what fits in one function call.
Build the template once, and call grep() to find the fits.
Find what fits in one function call.
Compile the expression once, and match it many times.

Enjoy!
Ali Corbin
ali@blindchicken.com