
I'm trying to figure out how to write the CUDA equivalent of OpenMP's for reduction(). I've done some research online, but nothing I've tried has worked. The code:

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < N; i++)
    {
        float f = ...  //store return from function to f
        out[i] = f;    //store f to out[i]
        sum += f;      //add f to sum and store in sum
    }

I know what for reduction() does in OpenMP: it makes the last line of the for loop possible. But how can I express the same thing in CUDA?

Thanks!

pauliwago
  • There are some examples of doing a reduction in CUDA: http://people.maths.ox.ac.uk/gilesm/cuda/prac4/reduction.pdf – osgx Dec 10 '12 at 00:27

1 Answer


Use Thrust, an STL-inspired library that comes with CUDA. See the Quick Start Guide for examples of how to perform reductions.
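As a rough illustration of what that could look like for the loop in the question, here is a minimal sketch in a `.cu` file. The functor `compute_f` is a hypothetical stand-in for the elided per-element function, and `fill_and_sum` is just an illustrative name; neither comes from the question. It fills `out` with `thrust::transform` and then sums it with `thrust::reduce`.

```cuda
#include <thrust/device_vector.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/transform.h>
#include <thrust/reduce.h>
#include <thrust/functional.h>

// Hypothetical stand-in for the elided function in the question:
// maps the loop index i to the value f that the loop body would compute.
struct compute_f
{
    __host__ __device__
    float operator()(int i) const { return 0.5f * i; }
};

// Writes compute_f(i) into out[i] for every i in [0, out.size()),
// then returns the sum of all elements of out.
float fill_and_sum(thrust::device_vector<float>& out)
{
    const int n = static_cast<int>(out.size());

    // Equivalent of "out[i] = f;" for each loop iteration.
    thrust::transform(thrust::counting_iterator<int>(0),
                      thrust::counting_iterator<int>(n),
                      out.begin(),
                      compute_f());

    // Equivalent of "reduction(+:sum)": add up every element on the device.
    return thrust::reduce(out.begin(), out.end(), 0.0f, thrust::plus<float>());
}
```

With this stand-in functor, calling `fill_and_sum` on a `thrust::device_vector<float>` of size 8 should return 14, since 0.5 * (0 + 1 + ... + 7) = 14. If `out` is not needed afterwards, `thrust::transform_reduce` can compute the sum in one pass without storing the intermediate values.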

Roger Dahl
  • I am using C though, not C++. Can I still use that? – pauliwago Dec 09 '12 at 23:59
  • CUDA C, written in `.cu` files, is much closer to C++ than to C. You can't use Thrust directly in a `.c` file, but you can use it in a `.cpp` or `.cu` file and export functions to your C code with `extern "C"`. You can also switch from C to C++ by changing your file extensions and then making whatever (usually minimal) changes are needed to get your project to compile with the C++ compiler. – Roger Dahl Dec 10 '12 at 00:05
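
The `extern "C"` approach described in the comment above might look roughly like the following sketch, placed in a `.cu` file and compiled with nvcc. The function name `sum_on_device` and its signature are assumptions made for illustration, not part of Thrust or of the original code.

```cuda
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

// Hypothetical C-callable wrapper (the name and signature are illustrative).
// Copies n floats from host memory to the device and returns their sum.
extern "C" float sum_on_device(const float* host_data, int n)
{
    if (host_data == nullptr || n <= 0)
        return 0.0f;

    // Constructing a device_vector from a host range copies the data to the GPU.
    thrust::device_vector<float> d(host_data, host_data + n);

    // Sum on the device; the initial value 0.0f fixes the result type to float.
    return thrust::reduce(d.begin(), d.end(), 0.0f);
}
```

On the C side, declaring `float sum_on_device(const float* host_data, int n);` and linking against the object file produced by nvcc should be enough to call it from a plain `.c` file.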