Average sorting

Suppose that, instead of sorting an array, we just require that the elements increase on average. More precisely, we call an \(n\)-element array \(A\) \(k\)-sorted if, for all \(i = 1, 2, \ldots, n-k\), the following holds:

\[\frac{\sum_{j=i}^{i+k-1} A[j]}{k} \leq \frac{\sum_{j=i+1}^{i+k} A[j]}{k}\]
  1. What does it mean for an array to be 1-sorted?
  2. Give a permutation of the numbers 1, 2, …, 10 that is 2-sorted, but not sorted.
  3. Prove that an \(n\)-element array is \(k\)-sorted if and only if \(A[i] \leq A[i+k]\) for all \(i = 1, 2, \ldots, n-k\).
  4. Give an algorithm that \(k\)-sorts an \(n\)-element array in \(O(n \lg(n/k))\) time.
  5. Show that we can sort a \(k\)-sorted array of length \(n\) in \(O(n \lg k)\) time. (Hint: Use the solution to Exercise 6.5-9.)
  6. Show that when \(k\) is a constant, \(k\)-sorting an \(n\)-element array requires \(\Omega(n \lg n)\) time. (Hint: Use the solution to the previous part along with the lower bound on comparison sorts.)

The \(k\)-sorted condition is a relaxed version of “sorted.” Instead of requiring every consecutive pair to be in order, it only requires that sliding windows of size \(k\) have non-decreasing averages. Larger \(k\) means a more relaxed condition.

A. 1-sorted array

For \(k = 1\), the condition becomes \(A[i] \leq A[i+1]\) for all \(i = 1, 2, \ldots, n-1\). This is simply a fully sorted array.

B. 2-sorted but not sorted permutation

Consider: \(\langle 1, 3, 2, 4, 6, 5, 7, 9, 8, 10 \rangle\)

The 2-sorted condition (from part c) requires \(A[i] \leq A[i+2]\): we have \(1 \leq 2\), \(3 \leq 4\), \(2 \leq 6\), \(4 \leq 5\), \(6 \leq 7\), \(5 \leq 9\), \(7 \leq 8\), \(9 \leq 10\). All hold. But the array is not sorted since \(A[2] = 3 > 2 = A[3]\).

C. Proof of equivalence

The \(k\)-sorted condition states that the average of each window of \(k\) consecutive elements is at most the average of the next window (shifted by one position). Expanding the two sums and cancelling the \(k - 1\) terms they share reveals a clean equivalence.

The left sum is \(A[i] + A[i+1] + \cdots + A[i+k-1]\) and the right sum is \(A[i+1] + \cdots + A[i+k-1] + A[i+k]\). The terms \(A[i+1]\) through \(A[i+k-1]\) appear on both sides, so the inequality reduces to \(A[i] \leq A[i+k]\).

Conversely, if \(A[i] \leq A[i+k]\) holds for all valid \(i\), then adding \(A[i+1] + \cdots + A[i+k-1]\) to both sides and dividing by \(k\) recovers the \(k\)-sorted condition.

D. Algorithm to \(k\)-sort in \(O(n \lg(n/k))\) time

From part (c), an array is \(k\)-sorted exactly when \(A[i] \leq A[i+k]\) for all \(i\). This means each of the \(k\) interleaved subsequences (elements at positions \(i, i+k, i+2k, \ldots\) for \(i = 1, 2, \ldots, k\)) must be individually sorted.

$$\textsc{K-Sort}(A, n, k)$$$$\begin{aligned} 1& \quad \textbf{for }\,i\,=\,1\textbf{ to }\,k \\ 2& \quad \qquad \text{sort}\,\text{the}\,subsequence\,A[i]\,,\,\,\,A[i + k]\,,\,\,\,A[i + 2k]\,,\,\,\,\dots \\ \end{aligned}$$

Each of the \(k\) subsequences has approximately \(n/k\) elements. Sorting each takes \(O((n/k) \lg(n/k))\) time with any comparison sort. The total is:

\[k \cdot O\!\left(\frac{n}{k} \lg \frac{n}{k}\right) = O\!\left(n \lg \frac{n}{k}\right)\]

E. Sorting a \(k\)-sorted array in \(O(n \lg k)\) time

From part (c), in a \(k\)-sorted array, \(A[i] \leq A[i+k]\) for all valid \(i\). This means each element is at most \(k\) positions away from its final sorted position. We can exploit this with a min-heap of size \(k\), following the approach from Exercise 6.5-9.

$$\textsc{SortKSorted}(A, n, k)$$$$\begin{aligned} 1& \quad \text{let}\,B[1..n]\,\text{be}\,a\,\text{new}\,\text{array} \\ 2& \quad build\,a\,\text{min-heap}\,H\,\text{from}\,A[1..k] \\ 3& \quad \textbf{for }\,i\,=\,1\textbf{ to }\,n\,-\,k \\ 4& \quad \qquad B[i]\,=\,\textsc{Extract-Min}(H) \\ 5& \quad \qquad \textsc{Min-Heap-Insert}(H, \, A[i + k]) \\ 6& \quad \textbf{for }\,i\,=\,n\,-\,k\,+\,1\textbf{ to }\,n \\ 7& \quad \qquad B[i]\,=\,\textsc{Extract-Min}(H) \\ 8& \quad \textbf{return }\,B \\ \end{aligned}$$

The heap always contains at most \(k\) elements, so each \(\textsc{Extract-Min}\) and \(\textsc{Min-Heap-Insert}\) takes \(O(\lg k)\) time. With \(n\) extractions and at most \(n\) insertions, the total is \(O(n \lg k)\).

F. Lower bound of \(\Omega(n \lg n)\) for constant \(k\)

Suppose we could \(k\)-sort in \(o(n \lg n)\) time for constant \(k\). From part (e), we could then fully sort the result in \(O(n \lg k) = O(n)\) time (since \(k\) is constant and \(\lg k\) is constant). The combined algorithm would fully sort in \(o(n \lg n) + O(n) = o(n \lg n)\) time, contradicting the \(\Omega(n \lg n)\) comparison-sort lower bound. Therefore, \(k\)-sorting requires \(\Omega(n \lg n)\) time when \(k\) is constant.