A few weeks ago, I saw a random post on LinkedIn in which someone showed off a simple Sudoku solver written in C++, and I realized that I’ve never attempted to write one myself. Instead of even really reading the post, I got to work, and implemented a simple solver. It solved Sudokus, but would get stuck on harder puzzles, so I implemented branching and backtracking, and now here I am, two weeks later, having written six different Sudoku solvers in six different programming languages as a puzzle.
In this blog post series, I’ll walk you through each of them in turn, showcasing some fun, unique features of every language (well, almost). We will go through:
- Python, the reference implementation. Simple, imperative, and thus perfect as our baseline.
- SBCL (Common Lisp). We will utilize bitsets, macros, and optimization hints to show how we can make the solver both expressive and performant.
- SWI-Prolog. We will leave the whole backtracking and solving business to Prolog. The least amount of code, the most magical.
- Haskell. Datatypes and fancy functional magic to make the solver elegant, if slightly impenetrable.
- Rust. PERFORMANCE. Minimal allocations and copying.
- Elixir. Because I wanted to. Also, functional parallelism.
I’ll make the solvers publicly available in this repo as the series progresses.
Sounds fun? Well, then let’s buckle up and get started!
I’ve chosen Python as the reference implementation mostly because it’s a lingua franca: most of my readers will understand what’s going on syntax-wise, and we can focus on the algorithm. Apart from that, I wanted the following properties for our reference:
- Imperative: we want to tell the system exactly what to do and how to do it. Abstractions come later.
- Concise, but readable: we want to be able to implement the algorithm in, say, 50 lines or less.
- We want a clear API we can emulate in the other languages, unless the language itself pushes us in another direction.
- We do not need the optimal solving algorithm, but the simplest (that I can come up with) that will solve (hopefully) all solvable Sudokus. I didn’t verify my solvers against the state of the art, though, so YMMV.
Number 3 might be a bit vague, but I have high hopes that it will make more sense later on.
What do we want to implement?
We want a simple solver that is good enough to solve even very hard 9x9 Sudoku boards. To facilitate this, we will have to rely on depth-first search with backtracking (if that doesn’t tell you anything, wait for the code).
We will represent the input and output as nested arrays of size 9x9 containing integers. Empty cells will contain a `0`, filled cells will contain the numbers 1-9.
We will only work on Sudoku puzzles that are on 9x9 boards, with 3x3 boxes all around. If you want to support other sizes, that might be a fun little challenge.
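To make the representation concrete, here is a sketch of what a board could look like in Python. The grid itself is purely illustrative (a nearly complete, trivially easy puzzle), not one of the hard boards we’re after:

```python
# 0 marks an empty cell; everything else is a given.
# This particular grid is only meant to illustrate the representation.
example_board = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [4, 5, 6, 7, 8, 9, 1, 2, 3],
    [7, 8, 9, 1, 2, 3, 4, 5, 6],
    [2, 3, 4, 5, 6, 7, 8, 9, 1],
    [5, 6, 7, 8, 0, 1, 2, 3, 4],
    [8, 9, 1, 2, 3, 4, 5, 6, 7],
    [3, 4, 5, 6, 7, 8, 9, 1, 2],
    [6, 7, 8, 9, 1, 2, 3, 4, 5],
    [9, 1, 2, 3, 4, 0, 6, 7, 8],
]
```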
For the implementation, we have to look at three chunks of work: the high level solver, the propagation of filled fields, and candidate selection for our recursive descent.
Alright, I think we are ready to look at some code. The entry point for our solver will be a function called `solve()` that takes the board as its argument:
```python
def solve(board):
    if not propagate(board):
        return None
    if all(0 not in row for row in board):
        return board
    (i, j), cands = find_mrv_cell(board)
    for v in cands:
        new_board = [row[:] for row in board]
        new_board[i][j] = v
        solved = solve(new_board)
        if solved is not None:
            return solved
    return None
```
Well, that doesn’t make a tremendous amount of sense, does it? Let’s go through it top-to-bottom.
`propagate()` actually represents a core solver step. We will look at what it does exactly next, but it basically goes over the board and tries to fill it out. Returning `False` here means the board is unsolvable, so we exit early. The `all()` expression then checks if the board is solved. If so, we return it (we wouldn’t have to, since we modify it in place, but oh well).
If we didn’t solve it, we have to be a bit more clever in our approach, and we begin our descent (hopefully not into madness!). We find the most constrained empty cell and its candidate values, and try to solve the board with each of those values set. How that cell is selected, we will discover in a moment. For now, we will just notice that if one of the candidates works out, we return the solved board, and if no solution can be found, we return `None` again. As a bit of ceremony, we have to copy the board as well, because otherwise we’d continue operating on the same board across descent steps, and that just wouldn’t work.
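The copy is worth a second look. A plain `new_board = board`, or even `list(board)`, would only copy the outer list: the rows would still be shared, and our recursive branches would scribble over each other. A little sketch (illustrative, not part of the solver):

```python
board = [[0] * 9 for _ in range(9)]

shallow = list(board)        # copies the outer list, but the rows are shared
shallow[0][0] = 5
print(board[0][0])           # 5 -- oops, the original changed too

board[0][0] = 0              # reset
copied = [row[:] for row in board]  # what solve() does: copy every row
copied[0][0] = 5
print(board[0][0])           # 0 -- the original is untouched
```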
Propagation basically mimics how many people go about solving a Sudoku puzzle: we go through the individual cells and see if we can fill them out given the information already present in the other cells. We will do this until we cannot fill in anything anymore.
```python
def propagate(board):
    changed = True
    while changed:
        changed = False
        for i in range(9):
            for j in range(9):
                if board[i][j] == 0:
                    c = candidates(i, j, board)
                    l = len(c)
                    if l == 0:
                        return False
                    if l == 1:
                        board[i][j] = list(c)[0]
                        changed = True
    return True
```
Again, this is not the most straightforward code, but I’ll break it down for you.
We iterate until we cannot change anything in the board anymore (that is what `changed` represents). We set it to `True` to begin with to kick off the loop, and then set it to `False` at the start of every iteration. When we fill anything out, it gets flipped again and we can go over the board once more. This repeats until we have gone over the whole board once without changing anything, our stop condition.
Inside that loop, we have another nested loop that goes over each cell. For each cell that is not filled out, we calculate suitable candidates using `candidates()` (more below). If we do not find any, it means that the puzzle is unsolvable in its current state, and we return `False`. If we find exactly one candidate, we set the cell to it (ignore the call to `list()`, that’s just ceremony), and signal that we should iterate again.
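As a small aside (not part of the solver itself): `list(c)[0]` is just the simplest way to grab the only element out of a one-element set. If you prefer, `next(iter(c))` does the same without building an intermediate list:

```python
c = {4}                   # a single-candidate set
print(list(c)[0])         # 4 -- what propagate() does
print(next(iter(c)))      # 4 -- equivalent, without the intermediate list
```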
And that’s basically it! But how do we find suitable candidates?
```python
DIGITS = set(range(1, 10))

def grid_for(i, j, board):
    r0 = i - i % 3
    c0 = j - j % 3
    return {board[r][c]
            for r in range(r0, r0 + 3)
            for c in range(c0, c0 + 3)}

def candidates(i, j, board):
    used = (set(board[i])
            | {row[j] for row in board}
            | grid_for(i, j, board))
    used.discard(0)
    return DIGITS - used
```
In `candidates()`, we first collect all the numbers we find across the cell’s row, column, and 3x3 box into a set (and remove `0` for good measure). We then effectively invert that selection by removing it from the set of all possible digits (provided in `DIGITS`).
The only complicated thing here is subdividing the board into its 3x3 boxes, which we do with a bit of modulo magic in `grid_for()`. It basically relies on the fact that we can get to the nearest smaller number that’s divisible by 3 by subtracting the number modulo 3 from it. If that doesn’t help explain that part, feel free to skip it; it’s not really material to your understanding of the algorithm.
Lastly, we need to implement another candidate selection: the one for our recursive descent. That one is pretty concise, but a bit of a headscratcher:
```python
def find_mrv_cell(board):
    opts = [((i, j), candidates(i, j, board))
            for i in range(9)
            for j in range(9)
            if board[i][j] == 0]
    return min(opts, key=lambda t: len(t[1]))
```
I trust I don’t have to explain that code and we can just move on, right?
Jokes aside, the code is actually quite simple: for each cell that is unset, we collect its index and its candidates. We then return the cell that has the fewest candidates left (both the index and the set of candidates). That is the “minimum remaining values” (MRV) heuristic the function is named after.
To be fair, this is not the most readable piece of code I’ve ever written, but it might be a good puzzle in itself.
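If the `min`-with-`key` one-liner is a bit too dense for your taste, here is a more explicit, purely illustrative rewrite of the same idea (the solver above doesn’t use it):

```python
def find_mrv_cell_verbose(board):
    """Illustrative only: the same search as find_mrv_cell, spelled out as a loop."""
    best_cell, best_cands = None, None
    for i in range(9):
        for j in range(9):
            if board[i][j] != 0:
                continue
            cands = candidates(i, j, board)
            if best_cands is None or len(cands) < len(best_cands):
                best_cell, best_cands = (i, j), cands
    return best_cell, best_cands
```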
And that’s it! We’ve implemented a Sudoku solver together!
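If you want to try it out, a minimal usage sketch could look like this, using the illustrative `example_board` from earlier (or any board in the same nested-list format):

```python
solved = solve(example_board)
if solved is None:
    print("This puzzle has no solution.")
else:
    for row in solved:
        print(row)
```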
In this blog post, we implemented our first Sudoku solver together. This one might be the hardest for you to get through, because we had to implement the algorithm for the first time. In the coming posts, we can focus on all the weird and wonderful things we can do within those constraints with our various tools.
A small note on cadence: I plan to write and publish a new post about a solver every 10 days or so. The solvers are already implemented; I just have to polish them and iterate on them for each blog post.
I hope I was able to whet your appetite for solving Sudokus with me, and I’ll see you for our next post on solving in Common Lisp!