Strategy
The strategy for solving a puzzle may be regarded as comprising a combination
of three processes: scanning, marking up, and analyzing.
The 3×3 region in the top-right corner must contain a 5. By hatching across and
up from 5s located elsewhere in the grid, the solver can eliminate all of the
empty cells in the top-right corner which cannot contain a 5. This leaves only
one possible cell (highlighted in green).
Scanning
Scanning is performed at the outset and periodically throughout the solution.
Scans may have to be performed several times in between analysis periods.
Scanning consists of two basic techniques:
Cross-hatching: the scanning of rows (or columns) to identify which line in a
particular region may contain a certain number by a process of elimination. This
process is then repeated with the columns (or rows). For fastest results, the
numbers are scanned in order of their frequency. It is important to perform this
process systematically, checking all of the digits 1–9.
Counting 1–9 in regions, rows, and columns to identify missing numbers. Counting
based upon the last number discovered may speed up the search. It also can be
the case (typically in tougher puzzles) that the easiest way to ascertain the
value of an individual cell is by counting in reverse—that is, by scanning the
cell's region, row, and column for values it cannot be, in order to see which is
left.
Advanced solvers look for "contingencies" while scanning—that is, narrowing a
number's location within a row, column, or region to two or three cells. When
those cells all lie within the same row (or column) and region, they can be used
for elimination purposes during cross-hatching and counting (Contingency example
at Puzzle Japan). Particularly challenging puzzles may require multiple
contingencies to be recognized, perhaps in multiple directions or even
intersecting—relegating most solvers to marking up (as described below). Puzzles
which can be solved by scanning alone without requiring the detection of
contingencies are classified as "easy" puzzles; more difficult puzzles, by
definition, cannot be solved by basic scanning alone.
Candidates for each empty cell have been entered. Some cells have only one
candidate once obvious invalids have been excluded.
Marking up
Scanning stops when no further numbers can be discovered. From this point,
it is necessary to engage in some logical analysis. Many find it useful to guide
this analysis by marking candidate numbers in the blank cells. There are two
popular notations: subscripts and dots.
In the subscript notation the candidate numbers are written in subscript in the
cells. The drawback to this is that original puzzles printed in a newspaper
usually are too small to accommodate more than a few digits of normal
handwriting. If using the subscript notation, solvers often create a larger copy
of the puzzle or employ a sharp or mechanical pencil.
The second notation is a pattern of dots with a dot in the top left hand corner
representing a 1 and a dot in the bottom right hand corner representing a 9. The
dot notation has the advantage that it can be used on the original puzzle.
Dexterity is required in placing the dots, since misplaced dots or inadvertent
marks inevitably lead to confusion and may not be easy to erase without adding
to the confusion. Using a pencil would then be recommended.
An alternative technique that some find easier is to mark up those numbers that
a cell cannot be. Thus a cell will start empty and as more constraints become
known it will slowly fill. When only one marking is missing, that has to be the
value of the cell.
Analysis
The two main approaches to analysis are "candidate elimination" and
"what-if".
In elimination, progress is made by successively eliminating candidate numbers
from one or more cells to leave just one choice. After each answer has been
achieved, another scan may be performed—usually checking to see the effect of
the latest number. There are a number of elimination tactics, all of which are
based on the simple rules given above, which have important and useful
corollaries, including:
A given set of n cells in any particular block, row, or column can only
accommodate n different numbers. This is the basis for the "unmatched candidate
deletion" technique, discussed below.
Each set of candidate numbers, 1–9, must ultimately be in an independently
self-consistent pattern. This is the basis for advanced analysis techniques that
require inspection of the entire set of possibilities for a given candidate
number. Only certain "closed circuit" or "n×n grid" possibilities exist (which
have acquired peculiar names such as "X-wing" and "Swordfish", among others; see
List of Sudoku terms and jargon for more information). If these patterns can be
identified, elimination of candidate possibilities external to the grid
framework can sometimes be achieved.
One of the most common elimination tactics is "unmatched candidate deletion".
Cells with identical sets of candidate numbers are said to be matched if the
quantity of candidate numbers in each is equal to the number of cells containing
them; essentially, these are perfectly coincident contingencies. For example,
cells are said to be matched within a particular row, column, or region (scope)
if two cells contain the same pair of candidate numbers (p,q) and no others, or
if three cells contain the same triplet of candidate numbers (p,q,r) and no
others. The placement of these numbers anywhere else in the matching scope would
make a solution for the matched cells impossible; thus, the candidate numbers (p,q,r)
appearing in unmatched cells in the row, column or region scope can be deleted.
This principle also works with candidate number subsets—if three cells have
candidates (p,q,r), (p,q), and (q,r) or even just (p,r), (q,r), and (p,q), all
of the set (p,q,r) elsewhere in the scope can be deleted. The principle is true
for all quantities of candidate numbers.
A second related principle is also true — if each cell within a set of cells (in
a row, column or region scope) contains the same set of candidate numbers, and
if the number of cells is equal to the quantity of candidate numbers, the cells
and numbers are matched and only those numbers can appear in matched cells.
Other candidates in the matched cells can be eliminated. For example, if (p,q)
can only appear in 2 cells (within a specific row, column, region scope), other
candidates in the 2 cells can be eliminated.
The first principle is based on cells where only matched numbers appear. The
second is based on numbers that appear only in matched cells. The validity of
either principle is demonstrated by posing the question 'Would entering the
eliminated number prevent completion of the other necessary placements?'
Advanced techniques carry these concepts further to include multiple rows,
columns, and blocks. (See "X-wing" and "Swordfish" above.)
In the what-if approach, a cell with only two candidate numbers is selected, and
a guess is made. The steps above are repeated unless a duplication is found or a
cell is left with no possible candidate, in which case the alternative candidate
is the solution. In logical terms, this is known as reduction ad absurdum.
Nishio is a limited form of this approach: for each candidate for a cell, the
question is posed: will entering a particular number prevent completion of the
other placements of that number? If the answer is yes, then that candidate can
be eliminated. The what-if approach requires a pencil and eraser. This approach
may be frowned on by logical purists as trial and error (and most published
puzzles are built to ensure that it will never be necessary to resort to this
tactic,) but it can arrive at solutions fairly rapidly.
Ideally one needs to find a combination of techniques which avoids some of the
drawbacks of the above elements. The counting of regions, rows, and columns can
feel boring. Writing candidate numbers into empty cells can be time-consuming.
The what-if approach can be confusing unless you are well organized. The
proverbial Holy Grail is to find a technique which minimizes counting, marking
up, and rubbing out.
