
I’m evaluating Intel IPP to speed up certain parts of our code, e.g., addition, absolute value, and sorting, among others. I note this page in the manual:

While the rest of Intel IPP functions support only signals or images of 32-bit integer size, Intel IPP platform-aware functions work with 64-bit object sizes if it is supported by the target platform. … You can distinguish Intel IPP platform-aware functions by the L suffix in the function name, for example, ippiAdd_8u_C1RSfs_L. With Intel IPP platform-aware functions you can overcome 32-bit size limitations.

Of the three I mentioned above, it appears only sorting has 64-bit-aware functionality.

So, questions: can this be right? Can IPP not accelerate addition/abs on arrays beyond 32-bit indexing? Is there a master list of functions that have “platform-aware” (64-bit) alternatives in IPP? Do people hand-roll workarounds to the 32-bit limit, like calling the add/abs functions in a loop over 2^30-sized chunks?
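To make the last question concrete, here is a minimal sketch of the kind of chunked wrapper I have in mind. `add_f32_len32` is a hypothetical stand-in I wrote for a routine that takes an `int` length (the real call would be something like `ippsAdd_32f(pSrc1, pSrc2, pDst, len)`), and `add_f32_chunked` and `max_chunk` are names I made up for illustration, so treat this as an assumption about how one might do it, not as anything IPP provides.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a routine whose length argument is a 32-bit int,
   such as ippsAdd_32f(pSrc1, pSrc2, pDst, len). Swap in the real call. */
static int add_f32_len32(const float *a, const float *b, float *dst, int len)
{
    for (int i = 0; i < len; ++i)
        dst[i] = a[i] + b[i];
    return 0; /* 0 means success here, mirroring ippStsNoErr */
}

/* Apply the int-length routine to an array indexed by size_t by walking it
   in chunks of at most max_chunk elements (max_chunk must fit in an int). */
static int add_f32_chunked(const float *a, const float *b, float *dst,
                           size_t len, size_t max_chunk)
{
    assert(max_chunk > 0 && max_chunk <= (size_t)INT_MAX);
    for (size_t off = 0; off < len; ) {
        size_t n = len - off;          /* elements still to process */
        if (n > max_chunk)
            n = max_chunk;             /* clamp to what an int length can hold */
        int status = add_f32_len32(a + off, b + off, dst + off, (int)n);
        if (status != 0)
            return status;             /* stop at the first failing chunk */
        off += n;
    }
    return 0;
}
```

With `max_chunk` set to `(size_t)1 << 30`, this is the "loop over 2^30-sized chunks" idea; the same wrapper shape would apply to abs.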

Ahmed Fasih
  • If you're performing multiple operations on your data then you probably want to strip-mine it anyway, with a chunk size *much* smaller than 2^30 elements. – Paul R Jan 10 '18 at 13:04
  • @PaulR got it. Does the advice for small chunks hold even if I won't be doing multiple-operations, i.e., just add or abs or conj on an array? – Ahmed Fasih Jan 10 '18 at 13:12
  • No, no need to strip-mine if it's just a single operation and there is no possibility of combining it with other operations. If this is the case with simple operations like add/abs though then you will most likely be memory bandwidth-limited and will not see a significant improvement from using optimised routines. – Paul R Jan 10 '18 at 13:14
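
For readers unfamiliar with the strip-mining mentioned in these comments, a rough sketch follows, assuming two operations (add, then abs) applied back to back. `vec_add` and `vec_abs_inplace` are hypothetical stand-ins for `int`-length routines (for example `ippsAdd_32f` and `ippsAbs_32f_I`), and the function name and `block` parameter are invented for illustration. A good block size depends on the cache and is not specified anywhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for int-length vector routines; replace them with
   the real library calls. */
static void vec_add(const float *a, const float *b, float *dst, int len)
{
    for (int i = 0; i < len; ++i)
        dst[i] = a[i] + b[i];
}

static void vec_abs_inplace(float *x, int len)
{
    for (int i = 0; i < len; ++i)
        x[i] = x[i] < 0.0f ? -x[i] : x[i];
}

/* Computes dst = |a + b| one small block at a time, so the block written by
   the add is likely still in cache when the abs pass reads it back. */
static void abs_of_sum_stripmined(const float *a, const float *b, float *dst,
                                  size_t len, int block)
{
    assert(block > 0);
    for (size_t off = 0; off < len; ) {
        size_t rest = len - off;
        int n = rest < (size_t)block ? (int)rest : block;
        vec_add(a + off, b + off, dst + off, n);   /* first operation  */
        vec_abs_inplace(dst + off, n);             /* second operation */
        off += (size_t)n;
    }
}
```

Running each operation over the whole array in turn would instead stream the entire array through memory twice.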
