Hoare partition correctness
The version of \(\textsc{Partition}\) given in this chapter is not the original partitioning algorithm. Here is the original partition algorithm, which is due to C. A. R. Hoare:
$$\textsc{Hoare-Partition}(A, p, r)$$$$\begin{aligned} 1& \quad x\,=\,A[p] \\ 2& \quad i\,=\,p\,-\,1 \\ 3& \quad j\,=\,r\,+\,1 \\ 4& \quad \textbf{while }\,\text{TRUE} \\ 5& \quad \qquad repeat \\ 6& \quad \qquad \qquad j\,=\,j\,-\,1 \\ 7& \quad \qquad until\,A[j]\,\leq\,x \\ 8& \quad \qquad repeat \\ 9& \quad \qquad \qquad i\,=\,i\,+\,1 \\ 10& \quad \qquad until\,A[i]\,\geq\,x \\ 11& \quad \qquad \textbf{if }\,i\,<\,j \\ 12& \quad \qquad \qquad \text{exchange}\,A[i]\,\text{with}\,A[j] \\ 13& \quad \qquad \textbf{else }\textbf{ return }\,j \\ \end{aligned}$$
- Demonstrate the operation of \(\textsc{Hoare-Partition}\) on the array \(A = \langle 13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21 \rangle\), showing the values of the array and auxiliary values after each iteration of the while loop in lines 4–13.
- The next three questions ask you to give a careful argument that the procedure \(\textsc{Hoare-Partition}\) is correct. Assuming that the subarray \(A[p..r]\) contains at least two elements, prove the following:
- The indices \(i\) and \(j\) are such that we never access an element of \(A\) outside the subarray \(A[p..r]\).
- When \(\textsc{Hoare-Partition}\) terminates, it returns a value \(j\) such that \(p \leq j < r\).
- Every element of \(A[p..j]\) is less than or equal to every element of \(A[j+1..r]\) when \(\textsc{Hoare-Partition}\) terminates.
- Rewrite the \(\textsc{Quicksort}\) procedure to use \(\textsc{Hoare-Partition}\).
Hoare’s original partition scheme uses two pointers that move toward each other from opposite ends of the array, swapping elements that are on the wrong side of the pivot. Let’s work through each part systematically.
A.
Starting with \(A = \langle 13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21 \rangle\) and pivot \(x = 13\):
| Iteration | \(i\) | \(j\) | Array after swap | Action |
|---|---|---|---|---|
| Initial | 0 | 13 | \(\langle 13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21 \rangle\) | \(x = 13\) |
| 1 | 1 | 11 | \(\langle 6, 19, 9, 5, 12, 8, 7, 4, 11, 2, 13, 21 \rangle\) | \(j\) stops at 11 (\(A[11]=6 \leq 13\)), \(i\) stops at 1 (\(A[1]=13 \geq 13\)), swap |
| 2 | 2 | 10 | \(\langle 6, 2, 9, 5, 12, 8, 7, 4, 11, 19, 13, 21 \rangle\) | \(j\) stops at 10 (\(A[10]=2 \leq 13\)), \(i\) stops at 2 (\(A[2]=19 \geq 13\)), swap |
| 3 | 10 | 9 | \(\langle 6, 2, 9, 5, 12, 8, 7, 4, 11, 19, 13, 21 \rangle\) | \(j\) stops at 9 (\(A[9]=11 \leq 13\)), \(i\) advances to 10 (\(A[10]=19 \geq 13\)), \(i \geq j\), return 9 |
Final partition point: \(j = 9\). Elements \(A[1..9] = \langle 6, 2, 9, 5, 12, 8, 7, 4, 11 \rangle\) are all \(\leq 13\), and elements \(A[10..12] = \langle 19, 13, 21 \rangle\) are all \(\geq 13\).
B.
We first show that the indices stay within bounds, that is \(p \leq i \leq j \leq r\) throughout execution.
Initially, \(i = p-1\) and \(j = r+1\). In the first iteration:
- The \(j\) loop decrements \(j\) at least once, so \(j \leq r\)
- The \(i\) loop increments \(i\) at least once, so \(i \geq p\)
The \(j\) loop stops when \(A[j] \leq x = A[p]\). Since \(A[p] \leq A[p]\), the loop must stop by \(j = p\) at the latest. Similarly, the \(i\) loop stops when \(A[i] \geq x\), and since \(A[p] = x\), it must stop by \(i = p\) if not sooner.
Once the loops find \(i\) and \(j\) with \(A[i] \geq x\) and \(A[j] \leq x\):
- If \(i < j\), we swap and continue. The next iteration starts with \(i < j\), both within \([p, r]\)
- If \(i \geq j\), we return \(j\)
The key insight: we always have \(A[p] = x\), which serves as a sentinel ensuring the \(i\) loop stops by position \(p\). Therefore, \(i\) and \(j\) always remain within \([p, r]\).
Next we show that the return value satisfies \(p \leq j < r\). When the algorithm terminates, we have \(i \geq j\). At this point, \(j\) has been decremented from \(r+1\) and stopped at some position where \(A[j] \leq x\).
Since the subarray has at least two elements (\(r > p\)) and \(i \geq j\):
- We know \(j < r+1\), so \(j \leq r\)
- We know \(i \geq p\), and since \(i \geq j\), we have \(j \geq p\) at termination
- Since \(i\) and \(j\) cross (\(i \geq j\)) and both started from opposite ends, we must have \(j < r\)
To see why \(j < r\): if \(j = r\), then \(A[r] \leq x\). But \(i\) starts from \(p-1\) and must reach at least \(p\) (where \(A[p] = x \geq x\)). For \(i \geq j = r\) to hold with \(r > p\), \(i\) must have passed \(r\), which contradicts our bounds. Therefore \(j < r\).
Combined: \(p \leq j < r\).
Finally we show the partitioning is correct. When the algorithm terminates with \(i \geq j\), we need to show all elements \(A[p..j] \leq\) all elements \(A[j+1..r]\).
The algorithm maintains the following invariant: after each swap, all elements at positions \(\leq\) the previous \(i\) values are \(\geq x\), and all elements at positions \(\geq\) the previous \(j\) values are \(\leq x\).
When \(i \geq j\), the two regions have met or crossed. At this point:
- All elements to the left of \(j\) have been confirmed to be \(\leq x\) or swapped with elements \(\leq x\)
- All elements to the right of \(i\) have been confirmed to be \(\geq x\) or swapped with elements \(\geq x\)
- Since \(i \geq j\), the regions overlap, ensuring a complete partition
Therefore, every element in \(A[p..j]\) is \(\leq x\), and every element in \(A[j+1..r]\) is \(\geq x\), which means every element in \(A[p..j]\) is \(\leq\) every element in \(A[j+1..r]\).
C.
The key difference from the standard quicksort is that the first recursive call includes position \(q\) (not \(q-1\)), while the second call starts at \(q+1\). This is because \(\textsc{Hoare-Partition}\) doesn’t necessarily place the pivot in its final position; it only guarantees that \(A[p..q]\) and \(A[q+1..r]\) are properly partitioned relative to each other.