Can anyone explain how the gfor construct allocates CUDA threads? In ArrayFire, gfor can be used for parallel computation. However, the number of CUDA threads is limited, so I want to know how I can improve my ArrayFire code. Should I redesign my gfor loops according to the GPU hardware architecture?
-
Usually there are a few ArrayFire folks who look at this tag from time to time, like @Pavan, so hopefully someone with some expertise will comment. You mention "threads of CUDA are limited"; are there specific limitations you are concerned about (e.g. number of threads, registers per thread, etc.)? – Robert Crovella Mar 27 '13 at 17:44
-
This question is more appropriate for our forums. It has been asked and has received a reply over here http://forums.accelereyes.com/forums/viewtopic.php?f=17&t=25496 – Pavan Yalamanchili Apr 03 '13 at 19:46