M.Algorithms

Sieve Sort

April 6, 2026 · 3 min read · Pendry, S

Sorting · Part 6 of 7

Window detection, measurement-first routing, and the proof that the philosophy works on any data type. (#6 of 12 in the unified benchmark)

Pendry Sort is fast because it routes to non-comparison paths. Counting sort doesn't compare. Bucket scatter doesn't compare. Those paths depend on integer arithmetic. They can't handle floats, strings, structs, or anything with a custom comparator.

That limitation bothered me. The core philosophy (observe the data before choosing a strategy, extract everything from every touch) doesn't depend on integers. It's a design principle. It should work on any data type.

Sieve Sort is the test of that claim.

The question: Pendry Sort handles only int32. Can the philosophy survive without integer arithmetic? Sieve Sort answers yes: window detection + natural merge sort. Comparison-based. Type-agnostic. Optional counting sort path for integers.

The Window Detector

Before you sort anything, figure out what needs sorting.

Two passes. O(n) total. O(1) space.

Forward pass. Walk left to right tracking the running max. Every time an element is smaller than the running max, it's out of place. Mark that position as the right edge of the dirty window. Also track the global min and max.

Backward pass. Walk right to left tracking the running min. Every time an element is larger than the running min, mark that position as the left edge.

Result: everything outside the window is confirmed correct. Only the window needs work. If the forward pass never finds anything out of place, the array is already sorted. Cost: n−1 comparisons.

Pure form: The tenet from Post 1 in its purest expression. No windows of ±8. No inversion counting. No samples. Just: where does disorder start, and where does it end?

Three Paths

Tiny window (≤32 elements). Insertion sort. Always optimal at this size.

Low cardinality (value range ≤ window/4, range ≤ 1M). Counting sort. O(n + k). Integer-specialized. A generic comparator version skips this.

Everything else. Natural merge sort. Find ascending runs. Reverse descending runs in place. Extend short runs to minimum length 32. Merge bottom-up. On structured data: O(n log runs). On random: O(n log n).

Three paths. One handoff each. Earlier prototypes had inversion counting, threshold-based routing, and multiple fallbacks. They added complexity and created failure modes. Removing them was the fix.

Timsort family: Sieve Sort sits alongside Timsort and Smoothsort. Its distinguishing feature: the window detector as a preprocessing pass. Timsort runs directly into run detection. Sieve Sort identifies the dirty region first, then looks for runs only within it.

Sieve Sort v3: Unified Benchmark at 1M Elements

35 patterns. 1M int32 elements. 7 trials, median timing. Handles every pattern with zero N/A and zero FAIL.

Where Sieve Sort Wins

Pattern	Sieve-v3	Pendry	introsort	qsort
Outlier	6.1ms	127.1ms	122.5ms	78.5ms
Two Blocks	7.1ms	19.3ms	17.8ms	81.9ms
Scatter 5%	7.2ms	7.8ms	18.5ms	55.8ms
Scatter 10%	7.2ms	7.7ms	18.4ms	55.4ms

Where It Pays the Cost of Generality

Pattern	Sieve-v3	Pendry	Gravity
Random	102.5ms	10.6ms	10.6ms
Organ+Noise	172.7ms	10.0ms	10.3ms
Zipf-like	97.6ms	10.0ms	9.9ms

The cost: natural merge sort's O(n) buffer allocation and per-merge overhead vs. Pendry's bucket scatter which bypasses comparison entirely on integer data.

Aggregate

Metric	Value
Total time	810,240 µs (#6 of 12)
Wins	4 of 35
N/A + FAIL	0 + 0
vs qsort	3.4x faster aggregate
vs introsort	2.0x faster aggregate

Reading the Numbers

Look at what Sieve Sort wins. Outlier: 6.1ms vs Pendry's 127.1ms. That's a 20x advantage. The outlier pattern (one element displaced far from its origin) is Pendry Sort's worst pattern in the entire benchmark. Pendry's merged scan detects high inversions and bails to bucket scatter, which scatters everything across 125,000 buckets for a single misplaced element. Sieve Sort's window detector identifies the exact dirty region, finds it small, and insertion-sorts just that section.

Two Blocks: 7.1ms vs Pendry's 19.3ms. Sieve's window detector identifies the entire array as dirty, then natural merge sort finds two long ascending runs and merges them in one pass.

The losses are real too. On random data (102.5ms vs Pendry's 10.6ms), on organ+noise (172.7ms vs 10ms), on Zipf (97.6ms vs 10ms). Natural merge sort's buffer allocation and per-merge overhead costs more than Pendry's integer-specialized bucket scatter. That's the price of generality.

Type-Agnostic

The window detector and natural merge sort are comparison-based. They don't care what the data type is. The original Sieve Sort benchmarks verified this across int32, int64, double, and Record{key, payload}. Performance scales with the data pattern, not the data type. Total across all types: Sieve 36, qsort 0.

Sieve Sort proved the claim. The philosophy is separable from integer arithmetic. It works on anything you can compare.

Classification

Property	Value
Category	Adaptive hybrid (detection + routing)
Best case	O(n), sorted input
Low cardinality	O(n + k), counting sort (int-specialized)
Structured	O(n log runs), natural merge
Worst case	O(n log n), full natural merge
Space	O(n) merge buffer
Stable	Yes
Data type	Any comparable type
Rank	#6 of 12

Previously: Where the Thread Ends

Next: The Pendry Sort Paper — The capstone. All five axioms. The full family on one table.