I have a point cloud in device memory (mDPointsWS) with a memory layout where first all x-, then all y-, and finally all z-coordinates are stored. I use Thrust to compute a tight axis-aligned bounding box (AABB) of this point cloud. Here is my code:
// use the CUDA Thrust library for the AABB computation
thrust::pair<thrust::device_ptr<Real>, thrust::device_ptr<Real>> thrustAABB[3];

// do a parallel min/max reduction on the GPU for each coordinate axis
thrust::device_ptr<Real> dPointsWS(mDPointsWS);
for (uint32 i = 0, offset = 0; i < 3; ++i, offset += mPointCount)
    thrustAABB[i] = thrust::minmax_element(dPointsWS + offset,
                                           dPointsWS + offset + mPointCount);
cudaDeviceSynchronize();

// get the results from the GPU
for (uint32 i = 0; i < 3; ++i)
{
    mAABBWS[2 * i + 0] = *thrustAABB[i].first;
    mAABBWS[2 * i + 1] = *thrustAABB[i].second;
}
What I am wondering about is where the result of thrust::minmax_element is stored before the last code block. At the end I clearly download the results to host memory, but I would like to avoid that.
I've found the following article: thrust reduction result on device memory. However, my case is different since I use the return type thrust::pair<thrust::device_ptr<Real>, thrust::device_ptr<Real>>.
As the reduction function returns a pair of device_ptr objects, the minimum and maximum results should already be stored on the GPU, or do I misunderstand this? But if the results are stored on the GPU, how can I control their lifetime? For example, I would like to use the results directly for drawing the AABB with OpenGL, without downloading them to host memory first.
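To make this concrete, here is a rough sketch of what I have in mind (hypothetical names: mDAABBWS is a device buffer of 6 Reals, mAABBVbo is the OpenGL buffer object I draw the box from; I am assuming the iterators returned by thrust::minmax_element point into the original point array, so the values can be gathered with device-to-device copies):

#include <cuda_gl_interop.h>

// option 1: gather the six extrema into a compact device buffer,
// never touching host memory
for (uint32 i = 0; i < 3; ++i)
{
    cudaMemcpy(mDAABBWS + 2 * i + 0, thrustAABB[i].first.get(),
               sizeof(Real), cudaMemcpyDeviceToDevice);
    cudaMemcpy(mDAABBWS + 2 * i + 1, thrustAABB[i].second.get(),
               sizeof(Real), cudaMemcpyDeviceToDevice);
}

// option 2: write directly into a mapped OpenGL buffer object
cudaGraphicsResource* resource = nullptr;
cudaGraphicsGLRegisterBuffer(&resource, mAABBVbo, cudaGraphicsRegisterFlagsWriteDiscard);
cudaGraphicsMapResources(1, &resource);
Real* dMapped = nullptr;
size_t mappedSize = 0;
cudaGraphicsResourceGetMappedPointer(reinterpret_cast<void**>(&dMapped),
                                     &mappedSize, resource);
// ... same cudaMemcpy calls as above, but targeting dMapped ...
cudaGraphicsUnmapResources(1, &resource);

Would an approach like this work, or am I misunderstanding where the minmax_element results actually live?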