To understand backtracking, we need to familiarize ourselves with some key terms:
State Space: This is the set of all possible states in our problem. Think of it as all the possible configurations or situations we might encounter while solving the problem.
State Space Tree: If we were to draw the state space using a tree structure, we would get a state space tree.
Candidate: A candidate is a potential choice or element that we can add to our current partial solution.
Solution: A complete and valid answer to our problem. This is often referred to as a "candidate solution" because it comprises the candidates we've selected at each step of the backtracking process.
Dead-end: A state where we can't proceed further towards a valid solution.
Terminal Condition: This is the condition that tells us when to stop exploring a particular path. It could be because we've found a solution or because we've reached a dead end.
A backtrack step: This is the process of returning to a previous state in the algorithm's execution path when a terminal condition is reached to explore alternative possibilities.
Pruning: This is a technique we use to eliminate unnecessary exploration of the state space. It's like taking shortcuts in our search by avoiding paths we know won't lead to a solution.
These terms might sound confusing at first glance, so let's use an example to explain them in more depth.
One of the backtracking problems is finding all permutations of an array of numbers. For example, for the input [1, 2 ,3]
, all possible permutations are [[1, 2 ,3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]
. We'll use this problem to explain the terminology from above.
The state space encompasses all possible paths we could take to reach a solution, including both valid and invalid paths. In our permutation problem, this includes all possible arrangements of the numbers, even those that repeat digits. To make this abstract concept more tangible, we visualize it as a tree structure, which we call the state space tree.
Candidates are all possible elements we could add to our solution at each level of the tree. These are the choices available to us at any given point in our backtracking process. At the root level, our candidates are numbers 1
, 2
, and 3
; we can start our permutation with any of these numbers. This is why we see three branches stemming from the root.
At the second level, for each branch from the root, we again have three candidates to choose from. This results in nine total candidates at the second level (three for each of the three first-level nodes). It's important to note that at this stage, we're considering all possibilities, including invalid ones like choosing the same number twice.
Terminal Conditions are reached at the leaf nodes of our state space tree, representing the end points of our exploration paths. These conditions mark the moment when we can no longer extend our current candidate solution and must make a decision. In our permutation problem, a terminal condition is reached when we've selected three numbers, as this completes a potential permutation. At this point, we need to evaluate the completed candidate solution.
Solutions are those terminal conditions that satisfy all the constraints of our problem. In the context of our permutation task, a solution is a complete arrangement of the numbers 1
, 2
, and 3
where each number is used exactly once.
Dead-ends (represented by X mark on the drawing), on the other hand, are terminal conditions that violate one or more of our problem constraints. In our permutation example, dead-ends would include arrangements like [1,1,1]
, [1,2,2]
, or [2,3,3]
, where numbers are repeated.
The entire tree, as shown above, represents naive branching logic. Naive branching logic refers to the approach where we explore every possible option at each step without any early pruning or optimization. Here's why we call it "naive":
A backtracking step occurs when the algorithm has reached a terminal condition (either a solution or a dead-end) and needs to return to a previous state to explore other possibilities. It's the "back" in "backtracking" - the mechanism that allows the algorithm to explore all potential solutions systematically. Once we reach a terminal condition, we "backtrack" by undoing the most recent choice (removing the last candidate from the potential solution) and thus returning to the previous state. Then, we move on to the next unexplored option at that level. Once all options at that level are explored, we backtrack further.
We start with []
as our initial state, represented by the root of the tree. From there, we add 1
to reach [1]
. We then add 1
again, reaching [1,1]
. In naive branching, we add 1
once more to get [1,1,1]
. At this point, we've reached a terminal condition (a complete but invalid permutation), triggering our first backtrack. This backtrack is shown by the dashed arrow returning to [1,1]
.
Next, we try our next option, reaching [1,1,2]
. Since this is also a terminal condition (dead-end), we backtrack to [1,1]
and try the third and final number 3
. However, [1,1,3]
is also a dead-end, so we backtrack twice, popping two values from the candidate array until we reach [1]
. Then we add the number 2
and get [1,2]
, and subsequently try the next number 1
, giving us [1,2,1]
. This is not a valid solution, so we backtrack again. This process continues until we explore all possible options, systematically generating all valid permutations.
Pruning is an optimization technique used in backtracking algorithms to eliminate branches of the state space tree that are guaranteed not to lead to a valid solution. Unlike naive branching logic, pruning allows us to skip exploring certain paths based on problem-specific knowledge or constraints. Let's see how pruning works:
In this pruned tree, we can see how the algorithm avoids unnecessary exploration. The 'X'
marks indicate branches that are pruned and not explored further. These pruned branches represent paths where a number would be repeated in the permutation, which we know cannot lead to a valid solution. For instance, under the root node 1
, we immediately prune the left branch, as it would lead to [1,1]
, an invalid permutation. We only explore paths that don't repeat numbers, significantly reducing the number of branches we must traverse.