
IronHollow 0.5: SIMD FFT and convolution reverb

2026-03-18 · Sam Redmond

IronHollow 0.5 is the most significant release since the project started: it adds a SIMD-accelerated FFT and a full convolution reverb implementation.

SIMD FFT

The previous FFT was a clean Cooley-Tukey implementation with no platform-specific code. It was correct and readable but left a lot of performance on the table. Version 0.5 adds AVX2 and SSE4.1 kernels dispatched at runtime via CPUID, with a fallback to the scalar path for older hardware.
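To make the dispatch idea concrete, here is a minimal sketch of runtime kernel selection in Rust. It is not IronHollow's code: the names Kernel, select_kernel, scale, scale_scalar and scale_avx2 are hypothetical, and a simple buffer-scaling loop stands in for the FFT kernels. The only real API it relies on is the standard library's is_x86_feature_detected! macro, which queries CPUID-derived feature flags at runtime.

```rust
// Hypothetical names throughout; IronHollow's real kernels and API may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kernel {
    Avx2,
    Sse41,
    Scalar,
}

/// Pick the widest kernel the running CPU supports, falling back to scalar.
pub fn select_kernel() -> Kernel {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return Kernel::Avx2;
        }
        if std::arch::is_x86_feature_detected!("sse4.1") {
            return Kernel::Sse41;
        }
    }
    Kernel::Scalar
}

/// Scalar reference path: multiply every sample by `gain`.
fn scale_scalar(buf: &mut [f32], gain: f32) {
    for x in buf.iter_mut() {
        *x *= gain;
    }
}

/// Same loop compiled with AVX2 enabled, so the compiler is free to
/// vectorize it with 256-bit instructions. A real kernel would more
/// likely use explicit intrinsics from `std::arch`.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn scale_avx2(buf: &mut [f32], gain: f32) {
    for x in buf.iter_mut() {
        *x *= gain;
    }
}

/// Public entry point: callers never see which kernel ran.
pub fn scale(buf: &mut [f32], gain: f32) {
    match select_kernel() {
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        // Safety: only reached after AVX2 support was detected at runtime.
        Kernel::Avx2 => unsafe { scale_avx2(buf, gain) },
        // The SSE4.1 kernel is omitted for brevity and uses the scalar path here.
        _ => scale_scalar(buf, gain),
    }
}
```

In a sketch like this the feature check runs on every call. A production dispatcher would more plausibly detect once and cache a function pointer, but that is a design assumption rather than something this post states.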

On a Zen 3 machine, a 1024-point complex FFT goes from 4.2 µs to 0.9 µs, roughly a 4.7× speedup. The benchmarks are in the repository; run them with cargo bench.

Convolution reverb

Partitioned overlap-add convolution lets you apply an impulse response of arbitrary length without the latency that block convolution would otherwise require. The implementation follows Gardner's 1995 partitioning scheme, adapted for Rust's ownership model. Partition sizes default to the FFT size used for analysis; they're configurable if you're targeting a specific hardware latency budget.
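The partitioning idea can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not IronHollow's implementation: it uses uniform partitions rather than Gardner's non-uniform scheme, and it convolves each partition directly in the time domain where a real implementation would multiply FFT spectra. The names PartitionedConvolver and direct_convolve are hypothetical. What it does show is the structure: the impulse response is split into blocks, partition k contributes to the output delayed by k blocks, and each call returns a block of output after only one block of input, however long the impulse response is.

```rust
/// Hypothetical uniform partitioned overlap-add convolver (time-domain sketch).
pub struct PartitionedConvolver {
    block: usize,              // block size B, also the partition length
    partitions: Vec<Vec<f32>>, // impulse response split into chunks of at most B samples
    acc: Vec<f32>,             // overlap-add accumulator for output not yet emitted
}

impl PartitionedConvolver {
    pub fn new(ir: &[f32], block: usize) -> Self {
        assert!(block > 0 && !ir.is_empty());
        let partitions: Vec<Vec<f32>> = ir.chunks(block).map(|c| c.to_vec()).collect();
        // Partition k writes at most 2B - 1 samples starting at offset k * B,
        // so (P + 1) * B samples are enough for P partitions.
        let acc = vec![0.0; (partitions.len() + 1) * block];
        Self { block, partitions, acc }
    }

    /// Consume exactly one block of input and return one block of output.
    pub fn process(&mut self, input: &[f32]) -> Vec<f32> {
        let b = self.block;
        assert_eq!(input.len(), b);
        for (k, part) in self.partitions.iter().enumerate() {
            let offset = k * b; // partition k is delayed by k blocks
            for (i, &x) in input.iter().enumerate() {
                for (j, &h) in part.iter().enumerate() {
                    self.acc[offset + i + j] += x * h;
                }
            }
        }
        // Emit the oldest block, then shift the accumulator left by one block.
        let out = self.acc[..b].to_vec();
        self.acc.copy_within(b.., 0);
        let n = self.acc.len();
        self.acc[n - b..].fill(0.0);
        out
    }
}

/// Plain full-length convolution, used only as a reference for checking.
pub fn direct_convolve(x: &[f32], h: &[f32]) -> Vec<f32> {
    let mut y = vec![0.0; x.len() + h.len() - 1];
    for (i, &xi) in x.iter().enumerate() {
        for (j, &hj) in h.iter().enumerate() {
            y[i + j] += xi * hj;
        }
    }
    y
}
```

Feeding an input signal through process block by block should reproduce the leading samples of direct_convolve on the whole signal. Swapping the inner loops for an FFT, a spectrum multiply and an inverse FFT per partition gives the frequency-domain form without changing the delay structure.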

A small set of test impulse responses is bundled in the repository under test_data/irs/.

