I need to perform some linear algebra calculations in C++ with a matrix that almost never changes and a lot of small vectors (only a handful of 3x3 or 4x4 matrices, and vectors with 3 components). I was thinking about using a CPU instruction set available on 32-bit x86, 64-bit x86, and ARMv5 and above to speed things up and simplify the design of my math operations.
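To make the workload concrete, here is a minimal scalar sketch of the kind of operation I mean: one fixed 3x3 matrix applied to many 3-component vectors. The names `Vec3`, `Mat3`, `mul` and `transform_all` are hypothetical and used only for illustration, and the row-major layout is an assumption.

```cpp
#include <array>
#include <vector>

// Hypothetical types, for illustration only.
using Vec3 = std::array<float, 3>;
using Mat3 = std::array<float, 9>;  // assumed row-major 3x3

// Multiply one row-major 3x3 matrix by one 3-component vector.
inline Vec3 mul(const Mat3& m, const Vec3& v) {
    return Vec3{{
        m[0] * v[0] + m[1] * v[1] + m[2] * v[2],
        m[3] * v[0] + m[4] * v[1] + m[5] * v[2],
        m[6] * v[0] + m[7] * v[1] + m[8] * v[2]
    }};
}

// Apply the same (rarely changing) matrix to every vector, in place.
inline void transform_all(const Mat3& m, std::vector<Vec3>& vs) {
    for (Vec3& v : vs) {
        v = mul(m, v);
    }
}
```

This is the baseline I would like to beat, either with hand-written SIMD or with a library that does it for me.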
Surprisingly, I haven't found an instruction set aimed specifically at linear algebra. Most of them target general floating-point math, cache-friendly and as optimized as you like, but nothing really targets matrices and linear algebra. Is it just me, or is there no instruction set for linear algebra?
The new FMA3 from AMD looks like an interesting starting point, but it is still far too rare in current CPUs. I would like to stick to something as widespread as SSE on x86 or ARMv5 on ARM.
So, is there a popular instruction set for small, fast linear algebra computations? I could even accept a fair amount of numerical error if the speed is good enough.
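For reference, this is roughly what I imagine the hand-written SSE route would look like for a 4x4 matrix times a 4-component vector. It is only a sketch: `mat4_mul_vec4` and `SKETCH_HAS_SSE` are hypothetical names, the column-major layout is an assumption, and a scalar fallback is compiled when SSE is not available (for example on ARM).

```cpp
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical helper: out = M * v.
// m points to 16 floats in column-major order (an assumption of this sketch);
// v and out point to 4 floats each. No alignment is required.
inline void mat4_mul_vec4(const float* m, const float* v, float* out) {
#if SKETCH_HAS_SSE
    // Load the four columns of the matrix.
    __m128 c0 = _mm_loadu_ps(m + 0);
    __m128 c1 = _mm_loadu_ps(m + 4);
    __m128 c2 = _mm_loadu_ps(m + 8);
    __m128 c3 = _mm_loadu_ps(m + 12);
    // Broadcast each vector component across a register.
    __m128 x = _mm_set1_ps(v[0]);
    __m128 y = _mm_set1_ps(v[1]);
    __m128 z = _mm_set1_ps(v[2]);
    __m128 w = _mm_set1_ps(v[3]);
    // The result is a linear combination of the columns.
    __m128 r = _mm_add_ps(
        _mm_add_ps(_mm_mul_ps(c0, x), _mm_mul_ps(c1, y)),
        _mm_add_ps(_mm_mul_ps(c2, z), _mm_mul_ps(c3, w)));
    _mm_storeu_ps(out, r);
#else
    // Scalar fallback; a temporary keeps it correct if out aliases v.
    float tmp[4];
    for (int row = 0; row < 4; ++row) {
        tmp[row] = m[row] * v[0] + m[4 + row] * v[1]
                 + m[8 + row] * v[2] + m[12 + row] * v[3];
    }
    for (int row = 0; row < 4; ++row) {
        out[row] = tmp[row];
    }
#endif
}
```

Having to write and maintain a separate path like this for every target is exactly what I would like to avoid.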
EDIT:
I should also note that in practice my compilers are:
- gcc
- MinGW
- Visual Studio
so I would like an open-source library that is portable across both x86 and ARM.
EDIT 2: Eigen doesn't support multithreaded execution, which is a big drawback for me.