Olli Parviainen bf3cec0244 Improvements to help compiler autovectorization
Refactored the FIRfilter and TDStretch hot-spot routines to help the
compiler perform more efficient autovectorization.
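
As an illustration of the kind of loop shape that tends to autovectorize
well, here is a minimal sketch of a mono FIR filter. It is not the code
from this commit: the function name `firFilterMono`, its signature, and the
multiple-of-4 length assumption are all hypothetical and chosen only for
the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not SoundTouch's actual code: a mono FIR filter
// written so the inner loop is simple for a compiler to autovectorize.
// Assumes `length` is a multiple of 4 and `dest` has room for
// numSamples - length + 1 samples. Returns the number of samples written.
std::size_t firFilterMono(float *dest, const float *src,
                          const float *coeffs,
                          std::size_t numSamples, std::size_t length)
{
    assert(length > 0 && length % 4 == 0);
    if (numSamples < length) return 0;

    const std::size_t outCount = numSamples - length + 1;
    for (std::size_t j = 0; j < outCount; j++)
    {
        const float *p = src + j;  // window start, hoisted out of inner loop
        float sum = 0.0f;          // local accumulator, cannot alias dest

        // Fixed trip count, unit stride, no branches in the loop body.
        for (std::size_t i = 0; i < length; i++)
        {
            sum += p[i] * coeffs[i];
        }
        dest[j] = sum;
    }
    return outCount;
}
```

The accumulator is kept in a local variable and written to `dest` once per
output sample, so the compiler does not have to assume that stores alias
the input. Note that with floating-point samples gcc generally vectorizes
such a reduction only when reassociation is allowed (for example with
-ffast-math) and vectorization is enabled (for example with -O3).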

Benchmarked:
- 2x and 3x improvement in execution times of gcc-generated x86 SIMD
  code for the SSE2 and AVX instruction extensions respectively, when
  hand-tuned SSE intrinsics were disabled. Hand-tuned SSE code is
  still slightly faster than gcc-produced AVX.
- 2.4x cumulative improvement from ARM NEON tunings compared to the
  previous SoundTouch release.

Signed-off-by: Olli Parviainen <oparviai'at'iki.fi>
2020-10-13 20:46:23 +03:00