Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory? If so, is it as simple as assigning the value to a cudaMalloc'ed area, or should I use a thrust::device_ptr?
2 Answers
Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory?
The short answer is no.
thrust::reduce returns a value, the result of the reduction, and that value must be stored in a host-resident variable.
The call is synchronous and always returns its result to the CPU:
template<typename Iterator, typename T>
T reduce(Iterator first, Iterator last, T init);
Once the result of the operation has been returned to the CPU, you can copy it to the GPU if you like:
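#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

int main() {
    thrust::device_vector<int> data(256, 1);
    thrust::device_vector<int> result(1);

    // reduce returns the sum to the host; the assignment copies it back to the device
    result[0] = thrust::reduce(data.begin(), data.end());

    // reading result[0] copies the value to the host again for printing
    std::cout << "result = " << result[0] << std::endl;
    return 0;
}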
Another alternative is thrust::reduce_by_key, which writes the reduction result to device memory rather than returning it to the host. If you use a single key for the entire array, the net result is a single output, similar to thrust::reduce.

Good answer. If you really want to do the reduction and keep the result in device memory, you can either use the CUDA NPP library or write the reduction yourself; look for the reduction sample in the CUDA examples. – TripleS Feb 15 '14 at 16:32
reduce_by_key with a constant_iterator is a great solution to the question, thank you – Michael Feb 07 '19 at 22:39
Yes, it should be possible by using thrust::reduce_by_key instead with a thrust::constant_iterator supplied for the keys.

I know this is old but thank you for this suggestion, that's a great solution to the question – Michael Feb 07 '19 at 22:39