
I was working through NVIDIA's "An Even Easier Introduction to CUDA". It includes sample code that adds two arrays together, which is straightforward. The two arrays are allocated in memory that is shared between the CPU and the GPU (unified memory, via `cudaMallocManaged`), which makes sense. However, there is also an integer variable holding the size of the arrays, and it is not allocated in that shared memory. All three are passed to the kernel as parameters. How does the kernel access the integer that holds the size? Why are the arrays allocated in shared memory while the integer is not? Am I missing something about how the memory works? After hours of searching the web and SO, I could not find an answer to this specific question, so I finally made an account to ask.
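For reference, the pattern I am asking about looks roughly like the sketch below. It is reconstructed under assumptions and is not a verbatim copy of the tutorial: I assume the arrays come from `cudaMallocManaged`, and the helper `max_error_after_add` is a hypothetical name I added so the example can be checked. The comments mark where the plain `int` and the two pointers enter the kernel.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// n is an ordinary int passed by value as a kernel argument.
// x and y are pointers (also passed by value) to managed allocations.
__global__ void add(int n, const float *x, float *y) {
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = blockDim.x * gridDim.x;
    for (int i = index; i < n; i += stride) {
        y[i] = x[i] + y[i];
    }
}

// Hypothetical helper (not from the tutorial): runs the addition on n
// elements and returns the largest deviation from the expected 3.0f,
// or -1.0f if a CUDA call fails.
float max_error_after_add(int n) {
    float *x = nullptr;
    float *y = nullptr;

    // The arrays live in unified (managed) memory, reachable from CPU and GPU.
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    if (cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        return -1.0f;
    }

    // Initialize on the host.
    for (int i = 0; i < n; ++i) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    int blockSize = 256;
    int numBlocks = (n + blockSize - 1) / blockSize;

    // n is a host stack variable here; its value is copied at launch.
    add<<<numBlocks, blockSize>>>(n, x, y);

    // Wait for the GPU before reading the results on the host.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(x);
        cudaFree(y);
        return -1.0f;
    }

    float maxError = 0.0f;
    for (int i = 0; i < n; ++i) {
        maxError = fmaxf(maxError, fabsf(y[i] - 3.0f));
    }

    cudaFree(x);
    cudaFree(y);
    return maxError;
}

int main() {
    printf("Max error: %f\n", max_error_after_add(1 << 20));
    return 0;
}
```

In this sketch, `n` is never allocated with `cudaMallocManaged`, yet the kernel reads it, which is exactly the part I do not understand.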

Thanks in advance, everyone. I'm excited to get more into CUDA and parallel computing.
