
I know that calls to a functor via thrust::for_each on data in thrust::host_vectors have a parallel execution policy, but do they actually execute in parallel?

If not, what is the correct way to invoke them, given that the system I'm running on is virtualized so that all cores appear to be on the same machine?

[EDIT]

I realize that there is such a thing as thrust::omp::par; however, I can't seem to find a complete Thrust example using OpenMP.

A.I.

1 Answer


In general, thrust operations dispatched on the "host" are not run in parallel. They use a single host thread.

If you want to run thrust operations in parallel on the CPU (using multiple CPU threads) then the recommended practice would be to use the thrust OpenMP backend.

A fully worked example is here.

Another worked example is here.
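For a sense of the overall shape, here is a minimal sketch (mine, not code from either linked example) of dispatching `thrust::for_each` on the OpenMP backend via `thrust::omp::par`. The compile line is a typical invocation and may need adjusting for where your Thrust headers live:

```cpp
// Minimal sketch: thrust::for_each dispatched on the OpenMP backend.
// Typical compile line (adjust the include path to your Thrust install):
//   g++ -O2 -fopenmp omp_for_each.cpp -o omp_for_each
#include <thrust/for_each.h>
#include <thrust/host_vector.h>
#include <thrust/system/omp/execution_policy.h>
#include <cstdio>

// When compiling with a plain host compiler (g++), Thrust defines
// __host__ to expand to nothing, so this decoration is safe outside nvcc.
struct square
{
    __host__ void operator()(float &x) const { x *= x; }
};

int main()
{
    thrust::host_vector<float> v(8);
    for (int i = 0; i < 8; ++i) v[i] = float(i);

    // Explicit OpenMP dispatch: elements may be processed by multiple
    // CPU threads (thread count controlled by e.g. OMP_NUM_THREADS).
    thrust::for_each(thrust::omp::par, v.begin(), v.end(), square());

    for (int i = 0; i < 8; ++i) printf("%f\n", v[i]);
    return 0;
}
```

Passing `thrust::omp::par` as the first argument forces the OpenMP backend regardless of the default host system, which is why `-fopenmp` is required at compile time.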

Robert Crovella
  • I have a follow-up question: so if I were to use a (custom) functor, would I define it as `__device__` or `__host__`? – A.I. Nov 07 '16 at 16:08
  • It seems that it should be `__host__`, even though either should work – A.I. Nov 07 '16 at 16:18
  • `__device__` and `__host__` are not the same as the thrust backend. For thrust "host" operations, the functor must include `__host__` decoration. For thrust "device" backend operations which use the GPU as the backend, the functor must include the `__device__` decoration. For all CPU-based backends (including OMP), the functor must include the `__host__` decoration. The reason for this is that `__host__` and `__device__` don't mean exactly the same thing as thrust "host" and thrust "device" backend. – Robert Crovella Jan 01 '17 at 00:02
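To illustrate that comment, here is a sketch (mine, not code from the answer) of a functor decorated with both qualifiers, so the same code can dispatch to either a CPU backend or the GPU backend when compiled with nvcc:

```cpp
// Sketch: one functor usable on both thrust "host" and "device" backends.
// Compile with: nvcc dual_backend.cu -o dual_backend
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/for_each.h>

struct increment
{
    // __host__ __device__ compiles operator() for both CPU and GPU,
    // satisfying either backend's decoration requirement.
    __host__ __device__ void operator()(int &x) const { x += 1; }
};

int main()
{
    thrust::host_vector<int>   h(4, 0);
    thrust::device_vector<int> d(4, 0);

    thrust::for_each(h.begin(), h.end(), increment()); // host backend -> __host__ path
    thrust::for_each(d.begin(), d.end(), increment()); // CUDA backend -> __device__ path
    return 0;
}
```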