I am using Apple's Accelerate framework, specifically vDSP, to perform several successive matrix and vector operations.

When does the CPU gather/copy the memory from the GPU?

Does it happen after every vDSP function call?

If not, is there a way to 'force' the gathering operation explicitly?

Hatchmaster J
  • Accelerate currently does not use the GPU. If you believe the GPU is still involved, you'll have to provide more context to help us understand how. – Ian Ollmann Jan 08 '15 at 21:23
  • Thanks @IanOllmann! While debugging my Accelerate code I encountered some phenomena where calculation results became garbage, and these phenomena disappeared when I added debug prints, so I guessed it was due to implicit memory management. – Hatchmaster J Jan 11 '15 at 07:52
  • Are there any plans for the Accelerate framework to take advantage of GPUs? – Hatchmaster J Jan 11 '15 at 07:53
  • @HatchmasterJ: It doesn't really make sense to use the GPU for operations at the level at which (most of the) Accelerate APIs live; you'd end up spending as much time moving the data between CPU and GPU as you currently do executing the operation on the CPU. You really want to move entire computations over to the GPU, rather than individual operations. – Stephen Canon Jan 12 '15 at 20:15
  • Even if you had Accelerate interfaces running on the GPU, within a matter of weeks you'd realize you really wanted Accelerate interfaces to be asynchronous and enqueueable onto an OpenCL cl_command_queue or a Metal MTLCommandQueue, so that you didn't have to synchronize the GPU pipeline with the CPU after each operation. Likewise, you'd want to replace flat pointers with cl_mem / MTLBuffer so that data stays on the GPU. It is certainly possible to construct a high-performance Accelerate-like API for the GPU, but it wouldn't be the Accelerate interface you have today. Some of it would translate. – Ian Ollmann Jan 20 '15 at 01:05
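
To illustrate the point made in the comments, here is a minimal C sketch of chained vDSP calls. Since Accelerate runs on the CPU, each call has finished writing its output buffer by the time it returns, so the next call (or a plain read) can use that buffer directly with no explicit "gather" step. The helper name `scaled_sum` and its buffers are hypothetical, and the `#else` branch contains portable stand-ins that are not part of Accelerate; they are only there so the sketch also builds on non-Apple systems.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>   /* link with: -framework Accelerate */
#else
/* Stand-ins for illustration only (NOT Accelerate): same argument order
   as the real vDSP routines, implemented as plain scalar loops. */
typedef unsigned long vDSP_Length;
typedef long vDSP_Stride;

static void vDSP_vadd(const float *a, vDSP_Stride ia,
                      const float *b, vDSP_Stride ib,
                      float *c, vDSP_Stride ic, vDSP_Length n)
{
    for (vDSP_Length i = 0; i < n; ++i)
        c[i * ic] = a[i * ia] + b[i * ib];
}

static void vDSP_vsmul(const float *a, vDSP_Stride ia, const float *s,
                       float *c, vDSP_Stride ic, vDSP_Length n)
{
    for (vDSP_Length i = 0; i < n; ++i)
        c[i * ic] = a[i * ia] * (*s);
}

static void vDSP_sve(const float *a, vDSP_Stride ia, float *c, vDSP_Length n)
{
    float sum = 0.0f;
    for (vDSP_Length i = 0; i < n; ++i)
        sum += a[i * ia];
    *c = sum;
}
#endif

/* Hypothetical helper: returns sum(scale * (a[i] + b[i])).
   tmp and out must each hold n floats and must not overlap a or b. */
float scaled_sum(const float *a, const float *b,
                 float *tmp, float *out, size_t n, float scale)
{
    assert(n > 0);

    /* tmp = a + b. The call is synchronous: tmp is fully written on return. */
    vDSP_vadd(a, 1, b, 1, tmp, 1, (vDSP_Length)n);

    /* out = tmp * scale. tmp is ordinary CPU memory, so it is used as-is. */
    vDSP_vsmul(tmp, 1, &scale, out, 1, (vDSP_Length)n);

    /* total = sum of out. */
    float total = 0.0f;
    vDSP_sve(out, 1, &total, (vDSP_Length)n);
    return total;
}
```

If results from a chain like this turn into garbage and the problem vanishes when debug prints are added, it is probably worth checking buffer lengths, strides, and uninitialized memory before suspecting any hidden memory transfer.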

0 Answers