I have a number of jobs to execute. Each job consists of a buffer write, a kernel execution, and a buffer read, and those operations must of course be executed in order. The jobs are, however, independent of one another and can therefore be executed concurrently.

Is there any performance difference between using multiple in-order command queues (like one would do with CUDA streams) and a single out-of-order one, with equivalent synchronization? Which is better?

Shepard

1 Answer

Some implementations don't support out-of-order command queues.

Based on your description I'd use multiple in-order queues, one per job. Using a single out-of-order queue would require events to synchronize the commands within each "virtual queue" (the write, kernel, and read of one job), which is extra work for you.
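To make the event bookkeeping concrete, here is a toy model in plain C of the ordering constraints on a single out-of-order queue. It is only a sketch: `Command`, `build_out_of_order`, and `can_start` are hypothetical names and not part of OpenCL. In real host code the `wait_on` link would correspond to passing the previous command's `cl_event` in the `event_wait_list` of `clEnqueueWriteBuffer`, `clEnqueueNDRangeKernel`, or `clEnqueueReadBuffer`. With one in-order queue per job, none of these links would be needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical, not OpenCL API. */
enum { OPS_PER_JOB = 3 };  /* 0 = buffer write, 1 = kernel, 2 = buffer read */
enum { NO_DEP = -1 };

typedef struct {
    int  wait_on;  /* index of the command that must finish first, or NO_DEP */
    bool done;     /* set to true once the command has completed */
} Command;

/* Fill cmds[0 .. n_jobs*OPS_PER_JOB - 1] for one out-of-order queue.
   Each command waits only on the previous command of the SAME job,
   so different jobs stay independent of each other. */
static void build_out_of_order(Command *cmds, int n_jobs)
{
    assert(cmds != NULL && n_jobs >= 0);
    for (int j = 0; j < n_jobs; ++j) {
        for (int op = 0; op < OPS_PER_JOB; ++op) {
            int i = j * OPS_PER_JOB + op;
            cmds[i].wait_on = (op == 0) ? NO_DEP : i - 1;
            cmds[i].done = false;
        }
    }
}

/* A command may start if it has not run yet and its dependency, if any,
   has completed. */
static bool can_start(const Command *cmds, int i)
{
    if (cmds[i].done) {
        return false;
    }
    int dep = cmds[i].wait_on;
    return dep == NO_DEP || cmds[dep].done;
}
```

With two jobs, the writes of both jobs (indices 0 and 3) can start immediately, while each kernel has to wait for its own job's write. That per-job chain is what you have to build by hand with events on an out-of-order queue.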

Dithermaster
  • Yes, but how many command queues? Even with the extra effort of creating the "batches" manually, I find it simpler to use a single out-of-order queue. But only if it is supported and performance won't suffer. – Shepard Mar 26 '16 at 20:28
  • > how many command queues? One for each in-flight job (you can reuse queues from completed jobs for new jobs). But if you find it simpler to use a single out-of-order queue, then try it. Verify that all the devices you want to run on support it. – Dithermaster Mar 27 '16 at 14:48
  • Yes, reusing old queues once they become available is a good idea. I would like to use a single queue, but I was curious how it would perform compared with multiple queues. I think I'm going to run some tests. – Shepard Mar 29 '16 at 12:06