In this post I want to demonstrate the power of recursive programming by programming a sudoku solver in Python. The complete sourcecode can also be downloaded here.
Although I like solving Sudokus by hand, I’m far away from being a professional Sudoku solver. Usually I get the medium-ones solved, however, since I always want to solve them without applying marks I usually don’t crack the hard ones.
However: Here this lack of intuition doesn’t prevent us from writing a solver in a few lines of code to crack any possible Sudoku. Here we’re not dealing with heuristics, but we’re rather doing some informed deep search. This approach is also called full search or back tracking, since the whole solution space is explored and if at any given time constraints are being violated, further processing is aborted (backtracking).
What’s the basic Idea?
The basic idea is to solve the problem with recursion. Recursive programs contain methods, which need to call themselves in order to solve a given problem. The most wellknown example to demonstrate recursive programming is the fibonacci sequence:
def fib(a):
if a == 0:
return 0
if a == 1:
return 1
return fib(a-2) + fib(a-1)
The fibonacci sequence starts with the numbers 0 and 1, all other numbers are calculated based on the sum of their previous two fibonacci numbers. As you can see, the function fib() calls itself in order to do the job.
Now let’s move to our Sudoku problem. We have to think of a method, which can be called recursively and always finds itself in front of the same task. There should not be a need to exchange data other than the Sudoku itself, we want to solve. With that in mind, let’s think of a way, how we can divide the sudoku solving task into steps, which are always the same and finally lead to a solved Sudoku:
- Determine the next empty cell
- Determine all potential candidates (all numbers, we are allowed to write into this cell)
- Loop through all candidates and do something
The only part which is a bit tricky is the loop. But it’s easier to evolve our thoughts when we see the code emerging in front of our mind and eyes, so let’s just start coding:
def solve(sudoku, rowId = 0, colId = 0):
# first determine next empty cell starting from rowId /
# colId, if current cell is not empty (!=0)
if sudoku[rowId, colId] != 0:
# field already set, find next empty cell
rowId, colId = findNextEmptyCell(sudoku, rowId, colId)
if rowId == None or colId == None:
...?-1-?...
# determine potential candidates for given cell
cands = getCandidates(sudoku, rowId, colId)
# loop through all candidates
for cand in cands:
sudoku[rowId, colId] = cand
sudoku = solve(sudoku, rowId, colId)
..?-2-?..
..?-3-?..
The code above does just the things we’ve described in the list above. Now there are some gaps, which we have to determine how to react.
Gap 1: Here we have to define, what the code should do, if there aren’t any empty cells left to solve. What do we need to do here? Of course! The Sudoku is solved and we want to return it. So let’s now adress the other challenges.
Gap 2: In this code location we have just filled out a cell with a potential candidate and called the solve() – function recursively. What do we have to check here? Of course we want to know, whether the Sudoku is solved or not. It might be, that we’d just filled out the last cell. If the solution returned here is complete, we do not want to try other candidates.
Gap 3: In this code location we’ve tested all potential candidates for the current cell, but none of them succeeded in a successful attempt. What does that tell us? Of course we know, that the Sudoku cannot be solved on the given basis. We thus have to delete the current cell again and do backtracking, that is, return the Sudoku with the cleared cell. Somewhere in higher call-hierarchies the function will have to try other candidates, in order to make the Sudoku solvable.
Now let’s impelment these gaps as just discussed:
def solve(sudoku, rowId = 0, colId = 0):
# first determine next empty cell starting from rowId /
# colId, if current cell is not empty (!=0)
if sudoku[rowId, colId] != 0:
# field already set, find next empty cell
rowId, colId = findNextEmptyCell(sudoku, rowId, colId)
if rowId == None or colId == None:
return sudoku
# determine potential candidates for given cell
cands = getCandidates(sudoku, rowId, colId)
# loop through all candidates
for cand in cands:
sudoku[rowId, colId] = cand
sudoku = solve(sudoku, rowId, colId)
if isSolved(sudoku):
return sudoku
sudoku[rowId, colId] = 0
return sudoku
Congratulations! The hard work is done! Now there are just some helper functions to be implemented to determine the next empty cell of a given sudoku or the potential candidates for a given empty cell. I’ll just present the complete source code here, together with a main block that passes the sudoku in the above picture (a hard one! :P):
import numpy as np
def getCandidates(sudoku, rowId, colId):
allowed = set(range(1, 10))
allowed.difference_update(sudoku[:,colId])
allowed.difference_update(sudoku[rowId,:])
# determine quadrant items
qRow = rowId // 3
qCol = colId // 3
quadSet = sudoku[3*qRow:3*qRow+3, 3*qCol:3*qCol+3]
allowed.difference_update(quadSet.flatten())
return allowed
def findNextEmptyCell(sudoku, rowId, colId):
row = rowId
col = colId
for i in range(row,9):
for j in range(col,9):
if sudoku[i,j] == 0:
return i, j
col = 0
return None, None
def isSolved(sudoku):
return sudoku.flatten().sum() == 405
def solve(sudoku, rowId = 0, colId = 0):
# first determine next empty cell starting from rowId / colId, if current cell is not empty (!=0)
if sudoku[rowId, colId] != 0:
# field already set, find next empty cell
rowId, colId = findNextEmptyCell(sudoku, rowId, colId)
if rowId == None or colId == None:
return sudoku
# determine potential candidates for given cell
cands = getCandidates(sudoku, rowId, colId)
# loop through all candidates and
for cand in cands:
sudoku[rowId, colId] = cand
sudoku = solve(sudoku, rowId, colId)
if isSolved(sudoku):
return sudoku
sudoku[rowId, colId] = 0
return sudoku
# Press the green button in the gutter to run the script.
if __name__ == '__main__':
sudoku = np.array([[0,7,2,4,8,5,0,0,0],
[4,0,8,2,0,0,0,0,0],
[5,0,0,0,0,9,4,0,0],
[0,0,5,0,0,1,0,0,8],
[0,0,0,0,6,0,0,0,0],
[1,0,0,5,0,0,9,0,0],
[0,0,4,1,0,0,0,0,5],
[0,0,0,0,0,4,3,0,7],
[0,0,0,7,3,8,2,1,0]])
result = solve(sudoku)
print(result)
Some additional Notes
This solution is very simple to implement and to come up with and here it does the trick. However, for large search space tasks as we find them in NP-hard problems this approach usually doesn’t help much. The way we solve the Sudoku here is just similar to the methodology one would pick a combination lock:
We start with the first empty cell, try candidate 1, go to the next empty cell, try candidate 1, and so on. If no more candidates are available we go one step back, try the second candidate,… Finally, we will end up with a solved Sudoku.
For this reason in more complex problems we usually apply other approaches like the implementation of heuristics, metaheuristics, construction- & improvement heuristics like simulated annealing, ant colony optimization, genetic algorithms and the like.