Welcome to the first post of the second season of Simple Sudoku Solvers! If you are unfamiliar with the series, I suggest you start with the first post, otherwise buckle up, because we are going to look at a lot of strange symbols today!
Today we’re going to look at APL, an old language fallen out of time, known for its terseness and strange alien symbols. Truth be told, I could never get into it, and still can’t. Nonetheless, writing a Sudoku solver seemed like the perfect weird challenge to get my hands dirty and give it an honest shot. And now I’m going to share the bloody results with you!
Fair warning: in my efforts to make this code as “APL-like” as possible, I’ve done a few iterations of golfing on it. Now the program truly looks like APL, but might also be somewhat inscrutable to those not in the know. To be clear, if I hadn’t written this code, I probably would have trouble following it myself.
Another thing to note: unlike in prior episodes, this time I will try to explain more or less every expression, in an attempt to be a bit friendlier to my readers.
With all of the encouraging stuff out of the way, let’s dive right in!
Why APL?
APL is a language that looks like someone spilled a box of alien maths onto your screen and then insisted this was a good idea. And maybe they were right, I’m not smart enough to answer this question.
A few things make APL a fun fit for Sudoku even for mere mortals:
- Everything is an array. Sudoku is a 9×9 array problem, and matrices and arrays are APL’s bread and butter. It sounds like a match made in heaven.
- Decoding indices and slicing submatrices is built-in. We essentially just lean on basic operators to slice and dice our rows, columns, and boxes.
- Set-ish operations are effortless. “Candidates are digits minus used digits” is basically one operator (see the tiny example after this list). So is everything else if you’re creative enough.
- Conciseness is not optional. APL will not let you hide from your data model: if you choose an awkward representation, it will show immediately. It is also unforgiving of muddled thinking, which is both scary and cool.
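As a tiny taste of that third point, here’s what “digits minus used digits” looks like at a Dyalog prompt. The used digits below are made up purely for illustration:
⍝ toy example: candidates are all digits without the ones already used
used←3 5 9
1 2 3 4 5 6 7 8 9~used    ⍝ dyadic ~ is “without”: gives 1 2 4 6 7 8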
This post is not an APL primer. Hopefully I’ll explain enough to follow along, but if you’ve never used APL before, the best strategy is: keep reading, copy/paste code into an interpreter, and allow yourself to be confused for a bit.
The dialect of APL I’m using is Dyalog APL, which seems to be the
industry standard as far as I can tell. It’s a commercial offering, but
you can download the engine and IDE for free if you’re using it mostly
for play like me. It also comes with a program called
dyalogscript, which is a way of running Dyalog programs
from the terminal instead of their somewhat foreign IDE. Unfortunately it’s pretty underspecified, and typing APL is awkward to begin with, so I spent way too much time trying to get this part to work. I recommend just running your code in their environment instead.
The implementation
As always, we initially represent the board as a 9x9 matrix, where 0 means “empty” and 1..9 are known digits. The solver has the usual shape: propagation (filling forced cells), MRV (minimum remaining values) cell selection, and backtracking/depth-first search. This time, however, we’ll lean harder on array operations for candidate computation, and keep everything else simple and obvious (well, if you know how to read it).
In this installment, we’ll reuse the 9-bit boolean mask trick from previous installments, partly for extra flavor and partly because its uniform length plays well with APL’s array operations. It forces us to compute masks and convert back and forth, but since our actual implementation is only ten ultra-terse lines, we can afford it.
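To make the mask idea concrete before we meet it in the code, here’s a toy sketch (the mask is invented): a 9-bit boolean vector marks which digits are still allowed, compressing the digit vector with it recovers the digits, and membership (∊) goes the other way.
⍝ toy mask: digits 2, 5 and 7 are still allowed
mask←0 1 0 0 1 0 1 0 0
mask/1 2 3 4 5 6 7 8 9        ⍝ compress: gives 2 5 7
(1 2 3 4 5 6 7 8 9)∊2 5 7     ⍝ membership: gives the mask back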
Importantly, we’ll deal internally with a single flat array of 81 values, to keep the operations quick and the indexing simple. So we’ll start by getting a matrix, flatten it, and only reshape it back into a 9x9 matrix for printing. The high-level structure looks like this (⍝ is the comment delimiter):
⍝ input matrix m (← is the assignment operator)
m←↑(
(0 0 3 0 2 0 6 0 0)
(9 0 0 3 0 5 0 0 1)
(0 0 1 8 0 6 4 0 0)
(0 0 8 1 0 2 9 0 0)
(7 0 0 0 0 0 0 0 8)
(0 0 6 7 0 8 2 0 0)
(0 0 2 6 0 9 5 0 0)
(8 0 0 2 0 3 0 0 9)
(0 0 5 0 1 0 3 0 0)
)
⍝ flattened p
p←,m
⍝ S is the solver, 9 9⍴ reshapes the result back into a 9x9 matrix, assigning to ⎕ prints it
⎕←9 9⍴S p
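If the ravel/reshape round trip feels abstract, here’s a tiny sketch on a made-up 2x3 matrix: , flattens row by row, and ⍴ builds the matrix back up. 9 9⍴S p is the same idea at full size.
⍝ toy round trip on a 2x3 matrix
t←2 3⍴1 2 3 4 5 6    ⍝ rows: 1 2 3 and 4 5 6
,t                   ⍝ ravel: 1 2 3 4 5 6
2 3⍴,t               ⍝ reshape restores the original matrix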
We also set indexing to be 0-based (it’s 1-based by default), and
compute the space of possible digits D:
⍝ the ⋄ sigil separates multiple expressions on one line
⎕IO←0 ⋄ D←1+⍳9
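To see why the origin matters, here’s a quick session check (nothing here is part of the solver):
⍝ with ⎕IO←0, ⍳9 counts from zero
⍳9        ⍝ 0 1 2 3 4 5 6 7 8
1+⍳9      ⍝ 1 2 3 4 5 6 7 8 9, i.e. our digit space D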
With that out of the way, let’s look at the algorithm!
The main solver
Staying true to the terseness of APL, we named the main solver function S. It’s a monadic (single-argument) function, and in APL we receive that argument as ⍵. With that out of the way, the rest of the algorithm should be self-evident:
S←{
b←P ⍵ ⋄ 0=≢b:⍬ ⋄ ~0∊b:b
i←⍸0=b ⋄ cs←↑(b∘C)¨i ⋄ l←+/cs ⋄ k←l⍳⌊/l ⋄ idx←i[k]
sols←(⊂∘∇)¨({(⍵@idx)⊢b}¨cs[k;]/D)
⊃(((0<(≢∘⊃)¨sols)/sols),⊂⍬)
}
Sarcasm notwithstanding, the code is actually not as bad as it looks.
We start by propagating with P (more on that below) and assigning the result to b. If P hands us back the empty vector ⍬, our unsatisfiability marker, we pass it straight on (0=≢b:⍬). If, on the other hand, there are no zeros left on the board, we are done and return the board b (~0∊b:b).
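If the guard syntax (condition, colon, result) is new to you, here’s a minimal, made-up dfn that mirrors the two guards above:
⍝ toy guards: empty argument means failure, no zeros means done
g←{0=≢⍵:'failed' ⋄ ~0∊⍵:'solved' ⋄ 'keep going'}
g ⍬         ⍝ 'failed'
g 1 2 3     ⍝ 'solved'
g 1 0 3     ⍝ 'keep going'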
With the preconditions out of the way, we can move on to the main branching algorithm. We get the indices i of the cells set to zero (i←⍸0=b), compute the candidate masks (cs←↑(b∘C)¨i, one 9-bit row per empty cell) and their respective counts (l←+/cs). We then pick the position k of the cell with the fewest candidates (k←l⍳⌊/l) and look up its board index (idx←i[k]).
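On made-up data, the empty-cell hunt and the MRV pick look like this (remember ⎕IO←0, so indices start at 0):
⍝ toy: where are the zeros in a 4-cell fragment?
⍸0=9 0 3 0        ⍝ empty positions: 1 3
⍝ toy candidate counts for four empty cells
cnt←3 1 2 1
cnt⍳⌊/cnt         ⍝ first index of the minimum count: 1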
After all of this, we are ready to solve the branches. We turn the 9-bit mask into the candidate digits (cs[k;]/D) and produce a new board for each of them ({(⍵@idx)⊢b}¨, idx being the index and @ the functional/copying update). We pipe each of those into the solver (∇ is the function calling itself), box each result, and assign the vector of boxes to sols (sols←(⊂∘∇)¨). All of that is one expression.
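Here’s the same pattern on a made-up four-cell fragment v (the index 2 and the digits are invented): compress the winning mask against D to get the digits, then build one updated copy per digit with @.
⍝ toy: cell 2 of the fragment is empty, digits 4 and 7 remain
v←9 3 0 1
(0 0 0 1 0 0 1 0 0)/D    ⍝ mask to digits: 4 7
{(⍵@2)⊢v}¨4 7            ⍝ one new copy per digit: (9 3 4 1)(9 3 7 1)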
Finally, we just need to return the first match. We unbox each
solution and get its length ((≢∘⊃)¨sols), filter for the
non-empty ones (0<), and take the first one, returning
⍬ if we can’t find anything (⊂⍬).
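The filter/take-first idiom on its own, with made-up boxed results (two failed branches and one solved one):
⍝ toy: two failures (⍬) and one success, each boxed like the items of sols
res←(⊂⍬)(⊂⍬)(⊂3 1 2)
(≢∘⊃)¨res                        ⍝ lengths of the boxed contents: 0 0 3
⊃(((0<(≢∘⊃)¨res)/res),⊂⍬)        ⍝ first surviving item: the box holding 3 1 2
⊃(((0<(≢∘⊃)¨2↑res)/2↑res),⊂⍬)    ⍝ with only the failures left, we fall back to ⍬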
Quick sidenote: you might ask yourself why we box sols
just to unbox it again. This is due to a feature of Dyalog APL, which
will happily mix the results of ∇¨ into a regular array if
it can, which breaks our filter/take first logic. Dyalog’s shape
coercion rule details are a bit much to get into here, but you can just
think of it as “if the shapes fit, smush them”, and that’s not what we
want here.
So, all that’s left to define for this to click are the candidate search C and the propagation P.
Propagation
We will now tackle propagation, which thankfully is a bit less involved (the next part will be simpler still!):
P←{
b←⍵ ⋄ i←⍸0=b ⋄ 0=≢i:b
cs←↑(b∘C)¨i ⋄ l←+/cs ⋄ 0∊l:⍬
f←l=1 ⋄ ~∨/f:b
b[f/i]←(f⌿cs)+.×D ⋄ ∇b
}
To remind ourselves: here we repeatedly fill any cell that has
exactly one candidate until we cannot any longer. We return
⍬ on contradiction.
First, we assign our argument to b, because I’m a wuss who wants descriptive names (b←⍵). Then we get the indices of empty cells again (i←⍸0=b, identical to above), returning early if there are none left (0=≢i:b), and compute the candidates (cs←↑(b∘C)¨i, again the same) and their counts (l←+/cs). Here we need to guard against impossibility: if any empty cell has 0 candidates, the current board is unsolvable and we return ⍬ (0∊l:⍬). Then we get the boolean mask of the cells with exactly one candidate (f←l=1) and guard again: if there are no cells we can fill, we return the board as-is (~∨/f:b).
Finally, we can fill the board (b[f/i]←(f⌿cs)+.×D)! This boils down to getting the indices of the cells we need to fill (f/i), selecting the corresponding masks (f⌿cs), and converting each mask to its digit by taking the dot product of the rows with D (+.×D). The rest is an array assignment.
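The dot-product trick on its own, with made-up masks: a single-candidate mask has exactly one 1, so summing mask times D picks out exactly that digit, and it works for a whole matrix of masks at once.
⍝ toy: a mask where only digit 6 survives
(0 0 0 0 0 1 0 0 0)+.×D     ⍝ inner product: 6
⍝ two such masks stacked as rows give one digit per row
(2 9⍴0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0)+.×D    ⍝ gives 6 1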
The last expression is tail recursion: we will try to go for another
round of propagation (∇b).
Candidate search
Compared to the previous snippets, candidate search is almost easy to grasp:
C←{
m←9 9⍴⍺ ⋄ r c←9 9⊤⍵ ⋄ r0 c0←3×⌊(r c)÷3
~(D∊0~⍨∪m[r;],m[;c],,m[r0+⍳3;c0+⍳3])
}
Here we get a board and an index and compute a mask. In “dyadic” APL functions (one argument goes on the left of the call and one on the right), the left argument is ⍺ and the right one is ⍵.
So first we reshape the board back into a 9x9 matrix (m←9 9⍴⍺), because rows, columns, and boxes fall out of that more easily. We then decode the flat index into a row/column pair (r c←9 9⊤⍵) that we can use to index the matrix, and compute the start indices of the 3x3 box (r0 c0←3×⌊(r c)÷3, each either 0, 3, or 6). We then collect the row, column, and box values (m[r;], m[;c], ,m[r0+⍳3; c0+⍳3]), deduplicate them (∪), and remove 0 (0~⍨). Finally, we turn what is left into a candidate mask (~(D∊<expr>)) and return it!
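With ⎕IO←0, the decode and box-origin arithmetic looks like this on a made-up flat index (40 happens to be the centre cell):
⍝ toy: flat index 40 sits in row 4, column 4
9 9⊤40            ⍝ decode: 4 4
3×⌊(9 9⊤40)÷3     ⍝ box origin: 3 3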
In my opinion, this is where APL truly shines. Being able to easily express column, row, and box indices is a superpower you only get from a language that takes arrays and matrices as fundamental.
And this concludes our tour de force! Congratulations, you made it!
Fin
I don’t really know much about APL, but I think we did a pretty good job of staying faithful to how you can work with it. I probably missed a whole bunch of things that are obvious to more experienced sigil slingers, but the easy immutable updating using @, the incredibly simple candidate selection, and the very declarative search algorithm (despite the scary syntax, it felt like just describing “map solve over the candidates, then pick the first non-empty result”) really did resonate with me.
Thank you for tuning in for the first part of the second season of our little Sudoku solver series. The next few episodes should be somewhat less esoteric while still scratching that “mind-bending” itch. See you around!