Lower bound on merging sorted lists

The problem of merging two sorted lists arises frequently. We have seen a procedure for it as the subroutine \(\textsc{Merge}\) in Section 2.3.1. In this problem, we will prove a lower bound of \(2n - 1\) on the worst-case number of comparisons required to merge two sorted lists, each containing \(n\) items.

First we will show a lower bound of \(2n - o(n)\) comparisons by using a decision tree.

  1. Given \(2n\) numbers, compute the number of possible ways to divide them into two sorted lists, each with \(n\) numbers.
  2. Using a decision tree and your answer to part (a), show that any algorithm that correctly merges two sorted lists must perform at least \(2n - o(n)\) comparisons.

Now we will show a slightly tighter \(2n - 1\) bound.

  1. Show that if two elements are consecutive in the sorted order and from different lists, then they must be compared.
  2. Use your answer to the previous part to show a lower bound of \(2n - 1\) comparisons for merging two sorted lists.

A. Number of ways to divide \(2n\) numbers into two sorted lists

Given \(2n\) distinct numbers, we need to choose which \(n\) go into the first list. Once chosen, the order within each list is determined (both must be sorted). So the number of ways is simply the number of ways to choose \(n\) items from \(2n\):

\[\binom{2n}{n} = \frac{(2n)!}{n! \cdot n!}\]

B. Lower bound of \(2n - o(n)\) using decision tree

A merging algorithm must distinguish between all \(\binom{2n}{n}\) possible input configurations. In the decision tree model, the tree must have at least \(\binom{2n}{n}\) leaves, so its height \(h\) satisfies \(2^h \geq \binom{2n}{n}\), giving:

\[h \geq \lg \binom{2n}{n}\]

Using Stirling’s approximation, \(\binom{2n}{n} \approx 4^n / \sqrt{\pi n}\), so:

\[\lg \binom{2n}{n} = 2n - \frac{1}{2}\lg(\pi n) = 2n - o(n)\]

Any merging algorithm must make at least \(2n - o(n)\) comparisons in the worst case.

C. Consecutive elements from different lists must be compared

Let \(x\) and \(y\) be consecutive in the final sorted order (no element lies between them in value), with \(x\) from list \(A\) and \(y\) from list \(B\), and \(x < y\).

Suppose for contradiction that some correct merging algorithm never compares \(x\) and \(y\). Since they are never compared, the algorithm’s execution (which comparisons it makes, in what order) is identical for any input where \(x\) and \(y\) are swapped. Consider a modified input where we exchange the values of \(x\) and \(y\), so now \(y < x\). Both inputs still yield valid sorted lists (since \(x\) and \(y\) are consecutive, swapping them preserves the sorted order within each list). But the correct merged output differs between the two inputs: the original has \(x\) before \(y\), while the modified input has \(y\) before \(x\). Since the algorithm makes the same comparisons and produces the same output for both, it must be wrong on at least one. Contradiction.

D. Lower bound of \(2n - 1\) comparisons

Consider the worst case: the two lists interleave perfectly in the sorted output, like \(a_1, b_1, a_2, b_2, \ldots, a_n, b_n\) where \(a_i\) comes from list \(A\) and \(b_i\) from list \(B\). In the merged sorted order, every pair of consecutive elements comes from different lists. There are \(2n - 1\) consecutive pairs, and by part (c), each must be compared. Therefore, any algorithm requires at least \(2n - 1\) comparisons on this input.

Since the standard \(\textsc{Merge}\) procedure uses exactly \(2n - 1\) comparisons on this worst case (it compares once per output element except the last), the bound is tight.