5

If i is a random walk like below (each index is not unique), and there is a device vector A filled with zeros.

{0, 1, 0, 2, 3, 3,  ....}

Is it possible that thrust can make A[i] auto-increment, after the operation A may look like

//2 means appears count of 0's
//1 means appears count of 1's
//1 means appears count of 2's
//2 means appears count of 3's
{2, 1, 1, 2}

I had tried several cases, but these case only works fine when A is a host vector, I guess that because thrust do the parallel, that it previous result can't affect the new one, the result may look like //only count once no matter the index appear how many times {1, 1, 1, 1}

Could thrust achieve my goal with device vector A and a random walk index vector?

Vitality
  • 20,705
  • 4
  • 108
  • 146
user1995868
  • 233
  • 1
  • 2
  • 11

1 Answers1

3

If you are seeking histogram calculations by Thrust, then you may wish to note that there is a Thrust documentation example providing two different algorithms:

  1. Dense histogram, using sort to sort the array, then using upper_bound to determine a comulative histogram and finally using adjacent_difference to calculating the histogram;
  2. Sparse histogram, using sort to sort the array and then reduce_by_key, as mentioned by @Eric in his comment.

From this two threads

I would say that the above are the only two methods to achieve an histogram using Thrust. I have timed both the approaches on a Kepler K20c card and these have been the timing:

  • N=1024*16; # bins = 16*16; Dense = 2.0ms; Sparse = 2.4ms;
  • N=1024*128; # bins = 16*128; Dense = 3.4ms; Sparse = 3.1ms;

On accounting for the fact that the timing does depend on the input array, I would say that the results do not appear to be dramatically different.

It should be noticed that the CUDA samples provide a histogram calculation example, but it is optimized for 64 or 256 bins, so it is not homogeneous to the above mentioned Thrust codes.

Robert Crovella
  • 143,785
  • 11
  • 213
  • 257
Vitality
  • 20,705
  • 4
  • 108
  • 146