Saturday, August 17, 2013

SIMD implementation of dot-product.

The dot product can be auto-vectorized only with fast-math or a similar compiler option. fast-math allows the compiler to treat floating-point addition as associative and reorder it freely. The option applies to the whole compilation unit, so other math functions, such as the exponential, can lose accuracy as a consequence.
Possible solutions:
  1. Compile the fast-math code separately from the rest of the program and then link it. This is the easy solution. However, it is a step back to C.
  2. Introduce a @fast_math attribute. This is hard to implement, but I hope it will be done in future compilers.
  3. Do the vectorization yourself. In that case you need to implement SIMD helper functions, such as an unaligned load.
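As a rough idea of what the third option involves, here is a sketch in C using the GCC/Clang vector extensions (an assumption: it is not the D implementation from this post, and the names float4, load_unaligned and dot_simd are hypothetical). The unaligned load is done with memcpy, which these compilers typically turn into a single unaligned load instruction.

```c
#include <stddef.h>
#include <string.h>

/* Four packed floats; vector_size is a GCC/Clang extension. */
typedef float float4 __attribute__((vector_size(16)));

/* Unaligned load helper: copy 16 bytes from an arbitrary address. */
static float4 load_unaligned(const float *p)
{
    float4 v;
    memcpy(&v, p, sizeof v);
    return v;
}

float dot_simd(const float *a, const float *b, size_t n)
{
    float4 acc = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: four multiply-adds per iteration, one per lane. */
    for (; i + 4 <= n; i += 4)
        acc += load_unaligned(a + i) * load_unaligned(b + i);

    /* Horizontal sum of the four lanes. */
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);

    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

Because the summation order is written out by hand, no fast-math flag is needed, and the rest of the program keeps strict floating-point semantics.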

So I wrote implementations of dot product for real and complex numbers.


Dot product for real numbers:

The complex version is similar. The main loop:

There are GDC and LDC implementations, but the LDC implementation has not been tested. The source code is available on GitHub.


Processor: Intel i5-4570 (Haswell)
Instruction set: AVX2
System: Ubuntu 13.10
Compiler: GDC-4.8.1
Compiler flags: -march=native -fno-bounds-check -frename-registers -frelease -O3