Fuzzy sorting of intervals
Consider a sorting problem in which we do not know the numbers exactly. Instead, for each number, we know an interval on the real line to which it belongs. That is, we are given \(n\) closed intervals of the form \([a_i, b_i]\), where \(a_i \leq b_i\). We wish to fuzzy-sort these intervals, i.e., to produce a permutation \(\langle i_1, i_2, \ldots, i_n \rangle\) of the intervals such that for \(j = 1, 2, \ldots, n\), there exist \(c_j \in [a_{i_j}, b_{i_j}]\) satisfying \(c_1 \leq c_2 \leq \cdots \leq c_n\).
- Design a randomized algorithm for fuzzy-sorting \(n\) intervals. Your algorithm should have the general structure of an algorithm that quicksorts the left endpoints (the \(a_i\) values), but it should take advantage of overlapping intervals to improve the running time. (As the intervals overlap more and more, the problem of fuzzy-sorting the intervals becomes progressively easier. Your algorithm should take advantage of such overlapping, to the extent that it exists.)
- Argue that your algorithm runs in expected time \(\Theta(n \lg n)\) in general, but runs in expected time \(\Theta(n)\) when all of the intervals overlap (i.e., when there exists a value \(x\) such that \(x \in [a_i, b_i]\) for all \(i\)). Your algorithm should not be checking for this case explicitly; rather, its performance should naturally improve as the amount of overlap increases.
Fuzzy sorting generalizes traditional sorting to handle imprecise data represented as intervals. The key insight is that overlapping intervals can be grouped together, reducing the problem size.
A.
The algorithm adapts quicksort’s partitioning to exploit interval overlap. Instead of partitioning around a single value, we partition around an interval.
The procedure works as follows. It selects a random interval as the pivot, then partitions the intervals into three groups: those completely before the pivot’s left endpoint (the left group), those that overlap the pivot (the overlapping group), and those completely after the pivot’s right endpoint (the right group). As we identify overlapping intervals, we track their common overlap region \([left, right]\), which shrinks as we intersect more intervals into it.
It returns the indices \((q, t)\) separating the three groups, where \(I[p..q-1]\) are left intervals, \(I[q..t]\) overlap, and \(I[t+1..r]\) are right intervals. We then recurse only on the left and right groups; the overlapping group is already correctly positioned, since those intervals can all take a common value from their shared intersection.
B.
In the general case the running time is \(\Theta(n \lg n)\). When intervals have minimal overlap, fuzzy-sort behaves like standard quicksort:
- Each partition operation takes \(O(n)\) time
- With random pivot selection, we expect balanced partitions
- Recursion depth is \(O(\lg n)\)
- Total time: \(O(n \lg n)\)
When all intervals overlap, the running time drops to \(\Theta(n)\). Suppose there exists a point \(x\) such that \(x \in [a_i, b_i]\) for all \(i\). Then:
- First partition: The pivot interval contains \(x\). Every other interval also contains \(x\), so all intervals overlap with the pivot.
- All intervals grouped together: After one partition, all \(n\) intervals are identified as overlapping and placed in the middle group \(I[q..t]\) where \(t = r\).
- No recursion needed: Since \(q = 1\) and \(t = n\), there are no left or right subgroups to sort.
- Total time: One partition pass through \(n\) intervals takes \(\Theta(n)\) time.
For intermediate cases, as overlap increases, more intervals get grouped together in each partition, reducing the recursion tree size. If each partition groups \(k\) intervals on average:
- Effective problem size reduces by factor of \(k\)
- Recursion depth becomes \(\lg(n/k)\)
- Total time: \(O(n \lg(n/k))\)
When \(k = n\) (complete overlap), this reduces to \(O(n \lg 1) = O(n)\).
The algorithm adapts naturally because it doesn’t explicitly check for overlap. Instead:
- More overlap \(\Rightarrow\) larger overlapping groups \(\Rightarrow\) fewer recursive calls
- Complete overlap \(\Rightarrow\) all intervals grouped in one partition \(\Rightarrow\) no recursion
- The performance improvement emerges naturally from the partitioning logic
This elegant property makes fuzzy-sort adaptive to the input structure without requiring explicit detection mechanisms.