9

I do some explicitly vectorised computations using SSE types, such as __m128 (defined in xmmintrin.h etc), but now I need to raise all elements of the vector to some (same) power, i.e. ideally I would want something like __m128 _mm_pow_ps(__m128, float), which unfortunately doesn't exist.

What is the best way around this? I could store the vector, call std::pow on each element, and then reload it. Is this the best I can do? How do compilers implement a call to std::pow when auto-vectorising code that otherwise is well vectorisable? Are there any libraries that provide something useful?
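For reference, the store/call/reload fallback mentioned above can be sketched as follows. This is only a minimal baseline, and `pow_ps_scalar` is a hypothetical helper name, not an existing intrinsic.

```cpp
#include <cmath>
#include <xmmintrin.h>

// Hypothetical helper (not an intrinsic): raise each lane of x to the power y
// by spilling the vector to memory and calling std::pow on every element.
inline __m128 pow_ps_scalar(__m128 x, float y)
{
    alignas(16) float tmp[4];
    _mm_store_ps(tmp, x);          // vector -> memory
    for (float& t : tmp)
        t = std::pow(t, y);        // scalar pow, one lane at a time
    return _mm_load_ps(tmp);       // memory -> vector
}
```

It is useful mainly as a correctness reference when benchmarking a vectorised replacement.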

(Note that this question is related but not a duplicate, and certainly doesn't have a useful answer.)

Walter
  • I've used http://gruntthepeon.free.fr/ssemath/ for `exp/log` and written `pow(x,k)` as `exp(k*log(x))` when auto-vectorisation was not an option. Not sure how it compares with auto-vectorised code. – SleuthEye Sep 19 '14 at 14:47
  • 3
    You could use Agner Fog's vector class. He has SIMD math functions (including pow, exp, log, sin,...) for SSE, AVX, and AVX512, for single and double precision floats and for ints. I don't see any good reason to use Intel's SVML or AMD's libm anymore. – Z boson Sep 20 '14 at 15:01
  • @Zboson, Is there a good C library for `exp()` with SSE4 support? – Royi Oct 30 '17 at 19:00

4 Answers

9

Use the formula exp(y*log(x)) for pow(x, y) and a library with SSE implementations of exp() and log().

Edit by @Royi: The above holds only when x is positive (y may be any real number). Otherwise more careful math is needed. See https://math.stackexchange.com/questions/2089690.
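A minimal sketch of how the formula composes in SSE code. The names `map_lanes`, `log_ps_standin`, `exp_ps_standin` and `pow_ps_explog` are hypothetical; the two stand-ins just call the scalar functions per lane, on the assumption that a vector math library would supply real SIMD `log`/`exp` routines of the same shape.

```cpp
#include <cmath>
#include <xmmintrin.h>

// Apply a scalar function to each of the 4 lanes (illustration only).
template <class F>
inline __m128 map_lanes(__m128 v, F f)
{
    alignas(16) float t[4];
    _mm_store_ps(t, v);
    for (float& e : t)
        e = f(e);
    return _mm_load_ps(t);
}

// Stand-ins for a library's vector log/exp (assumption: swap in the real ones).
inline __m128 log_ps_standin(__m128 v)
{
    return map_lanes(v, [](float e) { return std::log(e); });
}
inline __m128 exp_ps_standin(__m128 v)
{
    return map_lanes(v, [](float e) { return std::exp(e); });
}

// pow(x, y) = exp(y * log(x)); only meaningful for lanes with x > 0.
inline __m128 pow_ps_explog(__m128 x, float y)
{
    return exp_ps_standin(_mm_mul_ps(_mm_set1_ps(y), log_ps_standin(x)));
}
```

Lanes with x <= 0 need separate handling, since `log` is undefined or infinite there.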

Jasper Bekkers
  • I had a look at that library. It looks restricted to gcc, only knows about SSE2, and the documentation in the code is poor. I would also want it for the AVX types `__m256` and `__m256d`. – Walter Sep 19 '14 at 15:11
  • @Walter works fine with MSVC (note benchmarks with VS2010 at the bottom of the link), and code becomes more clear when looking at [cephes library](http://www.netlib.org/cephes/cmath.tgz) which seems to be the main inspiration. – SleuthEye Sep 20 '14 at 00:21
  • I need it to work with gcc, icc, clang. The original cephes library is great! If there is nothing better, I can at least implement my own log and exp along the lines of these libraries. – Walter Sep 20 '14 at 09:01
  • @Walter Did you even try to compile the library I've linked? There are about 5 compiler-specific lines, and they should work with all the compilers you've mentioned. – Jasper Bekkers Sep 23 '14 at 15:13
  • Sorry, had other things to do in the last couple of days. But this library does not support all the vector types I need. – Walter Sep 24 '14 at 16:02
  • Using this formula, how does one handle negative numbers ? – Baptiste Wicht Sep 22 '17 at 13:31
2

I really recommend the Intel Short Vector Math Library (SVML) for these types of operations. The library is bundled with the Intel compiler, which you mention in the list of compilers to support. I doubt it would be useful for gcc and clang, but it could serve as a reference point for benchmarking whatever pow implementation you come up with.

https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-DEB8B19C-E7A2-432A-85E4-D5648250188E.htm

Magnus
  • 2
    SVML can be useful with gcc. `gcc -mveclibabi=svml` will even let the vectorizer create calls to vmlsPow4 and such. – Marc Glisse Apr 11 '15 at 06:27
  • @MarcGlisse, Does `gcc` include Intel SVML built in? – Royi Oct 30 '17 at 18:30
  • 1
    gcc does not include SVML, it only has the knowledge of how to generate calls to it, if you promise that you will have it available for linking. – Marc Glisse Oct 30 '17 at 18:33
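To illustrate the `-mveclibabi=svml` route from the comments: the kind of loop the vectoriser can rewrite into vector pow calls looks roughly like the sketch below. The function name `pow_array`, the file name and the build line are assumptions for illustration. GCC documents that the flag needs vectorisation and unsafe math optimisations enabled, and that an SVML-compatible library must be supplied at link time.

```cpp
#include <cmath>
#include <cstddef>

// A plain element-wise loop; with a suitable vector math ABI selected, the
// compiler may replace the scalar std::pow calls by vector library calls.
// Assumed GCC build line on x86 (SVML must be provided separately):
//   g++ -O3 -ffast-math -mveclibabi=svml pow_loop.cpp <SVML library>
void pow_array(const float* x, float* out, std::size_t n, float y)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::pow(x[i], y);
}
```

Without those flags the same code still compiles and runs, just with scalar `pow` calls.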
2

An AVX version of the ssemath library is now available: http://software-lisc.fbk.eu/avx_mathfun/

With the library you can use:

exp256_ps(_mm256_mul_ps(y, log256_ps(x))); // pow(x, y) for x > 0; x and y are both __m256
Pietro
  • Yes, these provide the log, exp, sin, cos, and a sincos function for 8 `float`s using AVX. Unfortunately, the corresponding `double` versions are still outstanding (I actually need those more at the moment). – Walter Mar 29 '17 at 19:50
  • You could try the Intel SPMD compiler: http://ispc.github.io/ispc.html the documentation says it supports pow and AVX – Pietro Mar 31 '17 at 07:35
-2

Make a vector out of the float.

 _mm_pow_ps(v, _mm_set1_ps(f))
  • 1
    There is no `_mm_pow_ps()`, I'm afraid. Otherwise, I would not have asked. – Walter Apr 10 '15 at 11:47
  • Ah, I misunderstood the question. A Taylor series is the traditional approach; as noted earlier, http://gruntthepeon.free.fr/ssemath/ is a good resource. Depending on how important accuracy is, you can use far fewer terms. – Johan Köhler Apr 12 '15 at 23:18