Sieve Sort
Sorting · Part 6 of 7
Window detection, measurement-first routing, and the proof that the philosophy works on any data type. (#6 of 12 in the unified benchmark)
Series: 1 · Disorder Clusters | 2 · The C Revelation | 3 · Who Decided What Sorted Is | 4 · Why Look When You Can Just Do | 5 · Where the Thread Ends | 6 · Sieve Sort | 7 · The Pendry Sort Paper
Pendry Sort is fast because it routes to non-comparison paths. Counting sort doesn't compare. Bucket scatter doesn't compare. Those paths depend on integer arithmetic. They can't handle floats, strings, structs, or anything with a custom comparator.
That limitation bothered me. The core philosophy (observe the data before choosing a strategy, extract everything from every touch) doesn't depend on integers. It's a design principle. It should work on any data type.
Sieve Sort is the test of that claim.
The question: Pendry Sort handles only int32. Can the philosophy survive without integer arithmetic? Sieve Sort answers yes: window detection + natural merge sort. Comparison-based. Type-agnostic. Optional counting sort path for integers.
The Window Detector
Before you sort anything, figure out what needs sorting.
Two passes. O(n) total. O(1) space.
Forward pass. Walk left to right tracking the running max. Every time an element is smaller than the running max, it's out of place. Mark that position as the right edge of the dirty window. Also track the global min and max.
Backward pass. Walk right to left tracking the running min. Every time an element is larger than the running min, mark that position as the left edge.
Result: everything outside the window is confirmed correct. Only the window needs work. If the forward pass never finds anything out of place, the array is already sorted. Cost: n−1 comparisons.
Pure form: The tenet from Post 1 in its purest expression. No windows of ±8. No inversion counting. No samples. Just: where does disorder start, and where does it end?
Three Paths
Tiny window (≤32 elements). Insertion sort. Always optimal at this size.
Low cardinality (value range ≤ window/4, range ≤ 1M). Counting sort. O(n + k). Integer-specialized. A generic comparator version skips this.
Everything else. Natural merge sort. Find ascending runs. Reverse descending runs in place. Extend short runs to minimum length 32. Merge bottom-up. On structured data: O(n log runs). On random: O(n log n).
Three paths. One handoff each. Earlier prototypes had inversion counting, threshold-based routing, and multiple fallbacks. They added complexity and created failure modes. Removing them was the fix.
Timsort family: Sieve Sort sits alongside Timsort and Smoothsort. Its distinguishing feature: the window detector as a preprocessing pass. Timsort runs directly into run detection. Sieve Sort identifies the dirty region first, then looks for runs only within it.
Sieve Sort v3: Unified Benchmark at 1M Elements
35 patterns. 1M int32 elements. 7 trials, median timing. Handles every pattern with zero N/A and zero FAIL.
Where Sieve Sort Wins
| Pattern | Sieve-v3 | Pendry | introsort | qsort |
|---|---|---|---|---|
| Outlier | 6.1ms | 127.1ms | 122.5ms | 78.5ms |
| Two Blocks | 7.1ms | 19.3ms | 17.8ms | 81.9ms |
| Scatter 5% | 7.2ms | 7.8ms | 18.5ms | 55.8ms |
| Scatter 10% | 7.2ms | 7.7ms | 18.4ms | 55.4ms |
Where It Pays the Cost of Generality
| Pattern | Sieve-v3 | Pendry | Gravity |
|---|---|---|---|
| Random | 102.5ms | 10.6ms | 10.6ms |
| Organ+Noise | 172.7ms | 10.0ms | 10.3ms |
| Zipf-like | 97.6ms | 10.0ms | 9.9ms |
The cost: natural merge sort's O(n) buffer allocation and per-merge overhead vs. Pendry's bucket scatter which bypasses comparison entirely on integer data.
Aggregate
| Metric | Value |
|---|---|
| Total time | 810,240 µs (#6 of 12) |
| Wins | 4 of 35 |
| N/A + FAIL | 0 + 0 |
| vs qsort | 3.4x faster aggregate |
| vs introsort | 2.0x faster aggregate |
Reading the Numbers
Look at what Sieve Sort wins. Outlier: 6.1ms vs Pendry's 127.1ms. That's a 20x advantage. The outlier pattern (one element displaced far from its origin) is Pendry Sort's worst pattern in the entire benchmark. Pendry's merged scan detects high inversions and bails to bucket scatter, which scatters everything across 125,000 buckets for a single misplaced element. Sieve Sort's window detector identifies the exact dirty region, finds it small, and insertion-sorts just that section.
Two Blocks: 7.1ms vs Pendry's 19.3ms. Sieve's window detector identifies the entire array as dirty, then natural merge sort finds two long ascending runs and merges them in one pass.
The losses are real too. On random data (102.5ms vs Pendry's 10.6ms), on organ+noise (172.7ms vs 10ms), on Zipf (97.6ms vs 10ms). Natural merge sort's buffer allocation and per-merge overhead costs more than Pendry's integer-specialized bucket scatter. That's the price of generality.
Type-Agnostic
The window detector and natural merge sort are comparison-based. They don't care what the data type is. The original Sieve Sort benchmarks verified this across int32, int64, double, and Record{key, payload}. Performance scales with the data pattern, not the data type. Total across all types: Sieve 36, qsort 0.
Sieve Sort proved the claim. The philosophy is separable from integer arithmetic. It works on anything you can compare.
Classification
| Property | Value |
|---|---|
| Category | Adaptive hybrid (detection + routing) |
| Best case | O(n), sorted input |
| Low cardinality | O(n + k), counting sort (int-specialized) |
| Structured | O(n log runs), natural merge |
| Worst case | O(n log n), full natural merge |
| Space | O(n) merge buffer |
| Stable | Yes |
| Data type | Any comparable type |
| Rank | #6 of 12 |
Previously: Where the Thread Ends
Next: The Pendry Sort Paper — The capstone. All five axioms. The full family on one table.