Insertion Sort on Small Arrays in Merge Sort

Although merge sort runs in \(\Theta(n \lg n)\) worst-case time and insertion sort runs in \(\Theta(n^2)\) worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which \(n/k\) sublists of length \(k\) are sorted using insertion sort and then merged using the standard merging mechanism, where \(k\) is a value to be determined.

  1. Show that insertion sort can sort the \(n/k\) sublists, each of length \(k\), in \(\Theta(nk)\) worst-case time.
  2. Show how to merge the sublists in \(\Theta(n \lg(n/k))\) worst-case time.
  3. Given that the modified algorithm runs in \(\Theta(nk + n \lg(n/k))\) worst-case time, what is the largest value of \(k\) as a function of \(n\) for which the modified algorithm has the same running time as standard merge sort, in terms of \(\Theta\) notation?
  4. How should we choose \(k\) in practice?

Sorting Sublists

For an input of size \(k\), insertion sort runs in \(\Theta(k^2)\) worst-case time. So the worst-case time to sort \(n/k\) sublists, each of length \(k\), is \((n/k) \cdot \Theta(k^2) = \Theta(nk)\).

Merging Sublists

We have \(n\) elements divided into \(n/k\) sorted sublists, each of length \(k\). To combine these \(n/k\) sorted sublists into a single sorted list of length \(n\), we merge them two at a time, and then keep merging the resulting lists pairwise until only one list remains.

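Each round of pairwise merging halves the number of lists, so the process takes \(\lg (n/k)\) rounds (refer to Figure 2.5 on page 38 of the chapter text). Every round merges all \(n\) elements, which costs \(\Theta(n)\) because merging is linear in the total length of the lists involved. So the whole process runs in \(\Theta(n \lg (n/k))\) worst-case time.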
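The two phases described so far can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: the function names `insertion_sort`, `merge`, and `hybrid_merge_sort` are chosen here for illustration, the merging is done bottom-up over a list of runs instead of recursively, and `k` is assumed to be an integer of at least 1.

```python
def insertion_sort(a, lo, hi):
    """Sort the slice a[lo:hi] in place using insertion sort."""
    for i in range(lo + 1, hi):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # At most one of these two tails is non-empty.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def hybrid_merge_sort(items, k):
    """Return a sorted copy of items (assumes k >= 1)."""
    a = list(items)
    n = len(a)

    # Phase 1: insertion-sort each block of length k, Theta(nk) in total.
    for lo in range(0, n, k):
        insertion_sort(a, lo, min(lo + k, n))
    runs = [a[lo:lo + k] for lo in range(0, n, k)]

    # Phase 2: merge runs pairwise. Each round halves the number of runs
    # and touches every element once, giving about lg(n/k) rounds.
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                merged.append(merge(runs[i], runs[i + 1]))
            else:
                merged.append(runs[i])  # odd run carries over unchanged
        runs = merged

    return runs[0] if runs else []
```

For example, `hybrid_merge_sort([5, 2, 4, 7, 1, 3, 2, 6], 3)` first sorts the blocks `[5, 2, 4]`, `[7, 1, 3]`, and `[2, 6]` with insertion sort, then merges them in two rounds.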

Largest Value of k

For the modified algorithm to have the same asymptotic running time as standard merge sort, \(\Theta(nk + n \lg(n/k))\) must be the same as \(\Theta(n \lg n)\).

To satisfy this condition, \(k\) cannot grow asymptotically faster than \(\lg n\). If it did, the \(nk\) term would grow faster than \(n \lg n\), and the algorithm would run in asymptotically worse time than \(\Theta(n \lg n)\).

So let’s assume \(k = \Theta(\lg n)\) and check whether the condition is met:

\[\begin {aligned} \Theta(nk + n \lg(n/k)) & = \Theta(nk + n \lg n - n \lg k) \\ & = \Theta(n \lg n + n \lg n - n \lg (\lg n)) \\ & = \Theta(2n \lg n - n \lg (\lg n)) ^\dagger \\ & = \Theta(n \lg n) \end {aligned}\]

\(^\dagger\lg (\lg n)\) is very small compared to \(\lg n\) for sufficiently large values of \(n\), so the subtracted term is of lower order and does not affect the bound.
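One way to make this last step precise is to sandwich the running time between two multiples of \(n \lg n\). The sketch below assumes \(k = c \lg n\) for some constant \(c > 0\) (the symbol \(c\) is introduced here only for illustration) and \(n\) large enough that \(1 \le k \le n\), so that \(0 \le \lg(n/k) \le \lg n\):

\[\begin{aligned} nk + n \lg(n/k) & \ge nk = c \, n \lg n \\ nk + n \lg(n/k) & \le c \, n \lg n + n \lg n = (c + 1) \, n \lg n \end{aligned}\]

Both bounds are constant multiples of \(n \lg n\), so the total is \(\Theta(n \lg n)\) under these assumptions.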

Practical Value of k

In practice, \(k\) should be the largest input size for which insertion sort runs faster than merge sort on the target machine. To get an exact value, we need the exact running-time expressions, including their constant factors, and can then use the method described in Exercise 1.2.2.
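As a rough illustration of that method, the sketch below scans candidate sizes and keeps the largest one at which a given insertion sort cost is below a given merge sort cost. The cost functions \(8k^2\) and \(64 k \lg k\) are assumed step counts used only as an example, not measurements of any real machine, and the helper name `largest_k` and the `limit` parameter are hypothetical.

```python
import math


def largest_k(insertion_cost, merge_cost, limit=10_000):
    """Return the largest k in [2, limit] where insertion sort is cheaper.

    Returns None if insertion sort is never cheaper in that range.
    """
    best = None
    for k in range(2, limit + 1):
        if insertion_cost(k) < merge_cost(k):
            best = k
    return best


# Assumed step counts, for illustration only.
def insertion_steps(k):
    return 8 * k * k


def merge_steps(k):
    return 64 * k * math.log2(k)


crossover = largest_k(insertion_steps, merge_steps)
print(crossover)
```

With real code, the two cost functions would be replaced by timings of the actual insertion sort and merge sort implementations on representative inputs, since the constants depend on the machine and the implementation.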