5

For data-parallel algorithms on the GPU with CUDA, there are two standard libraries, CUDPP and Thrust, which implement sorting, reduction, prefix sum, etc.

So what are the main differences between the two libraries in terms of performance and features?
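To pin down what the primitives in question compute, here is a minimal CPU reference sketch in plain standard C++. It is not Thrust or CUDPP code, and the function names (`sorted_copy`, `reduce_sum`, `inclusive_scan_sum`, `exclusive_scan_sum`) are made up for illustration. Both libraries provide GPU versions of these operations, but their actual APIs differ from this.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// CPU reference for the three primitives named in the question.
// Plain standard C++; none of these are Thrust or CUDPP calls.

// Sorting: returns an ascending copy of the input.
std::vector<float> sorted_copy(std::vector<float> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Reduction: folds all elements into a single value with +.
float reduce_sum(const std::vector<float>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0f);
}

// Inclusive prefix sum: out[i] = v[0] + ... + v[i].
std::vector<float> inclusive_scan_sum(const std::vector<float>& v) {
    std::vector<float> out(v.size());
    std::partial_sum(v.begin(), v.end(), out.begin());
    return out;
}

// Exclusive prefix sum: out[i] = v[0] + ... + v[i-1], with out[0] = 0.
std::vector<float> exclusive_scan_sum(const std::vector<float>& v) {
    std::vector<float> out(v.size());
    float running = 0.0f;
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = running;   // sum of everything before element i
        running += v[i];
    }
    return out;
}
```

For the input `{3, 1, 4, 1, 5}`, the sorted copy is `{1, 1, 3, 4, 5}`, the reduction is `14`, the inclusive scan is `{3, 4, 8, 9, 14}`, and the exclusive scan is `{0, 3, 4, 8, 9}`.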

smilingbuddha
I believe this question deserves a more serious answer, but I would suggest you expand it to also include [cub](http://nvlabs.github.io/cub/), which I also believe is faster than the other two for some/all computational tasks. – einpoklum Oct 30 '16 at 19:28

1 Answer

3

I used both for sorting and prefix sums about a year ago (with CUDA 4.1; I can't remember the versions of Thrust and CUDPP). In my experience, on a float array of about 20M entries, CUDPP was a little faster, but Thrust was easier to use.

As for features, as far as I can recall, Thrust also works with host memory, not only with device memory (unlike CUDPP), but this might be outdated.

kroneml