Explain why the worst-case running time for bucket sort is \(\Theta(n^2)\). What simple change to the algorithm preserves its linear average-case running time and makes its worst-case running time \(O(n \lg n)\)?

A.

The worst case occurs when all \(n\) elements end up in a single bucket. Bucket sort then reduces to sorting that one bucket using insertion sort, which takes \(\Theta(n^2)\) in the worst case.

This can happen when the input is not uniformly distributed. For example, if \(A = \langle .51, .52, .53, .54, .55, .56, .57, .58, .59 \rangle\), all elements fall into bucket 5, and we must sort all 9 elements using insertion sort.

B.

Replace insertion sort with merge sort (or heapsort) for sorting individual buckets. Each bucket with \(k\) elements now takes \(O(k \lg k)\) time to sort, so in the worst case where all \(n\) elements land in one bucket, sorting takes \(O(n \lg n)\) time. The rest of the algorithm (distributing into buckets and concatenating) still takes \(\Theta(n)\) time, giving \(O(n \lg n)\) worst-case overall.

The average-case analysis is unaffected because it depends on the expected distribution of elements across buckets, not on which sorting algorithm we use within each bucket. When input is uniformly distributed, we expect \(O(1)\) elements per bucket, and sorting \(O(1)\) elements takes \(O(1)\) time whether we use insertion sort or merge sort. Total expected time remains \(\Theta(n)\).