Why do we analyze the expected running time of a randomized algorithm and not its worst-case running time?
For randomized algorithms, worst-case analysis provides little useful information because it depends on an unlikely alignment of both the input and the random choices. Expected running time gives a more meaningful performance guarantee.
Consider \(\textsc{Randomized-Quicksort}\). Its worst-case running time is still \(\Theta(n^2)\), which can occur if we’re extraordinarily unlucky with every random pivot choice. However, this worst case requires selecting the worst possible pivot (either the minimum or maximum element) at every single level of recursion. The probability of this happening is astronomically small, decreasing exponentially with the input size.
For an array of size \(n\), the probability of selecting the worst pivot at one level is \(2/n\) (either the minimum or the maximum). For this to happen at all levels, we’d need approximately \(\lg n\) consecutive worst-case choices (if we’re very unlucky) or up to \(n\) levels (if we’re catastrophically unlucky). Even in the more optimistic scenario:
\[\Pr[\text{worst case}] \approx \left(\frac{2}{n}\right)^{\lg n} = \frac{1}{n^{\lg n - 1}}\]This probability is negligible, about 5 in a trillion for even moderately sized array with \(n = 100\), far smaller than the probability of hardware failure or cosmic rays flipping bits in memory.
The expected running time of \(O(n \lg n)\) for \(\textsc{Randomized-Quicksort}\) on the other hand holds for every input. Unlike deterministic quicksort, which has bad inputs (sorted arrays), randomized quicksort has no worst-case input. An adversary cannot construct an input that guarantees bad performance because the algorithm’s behavior depends on random choices made during execution.