
Hi, I just wanted to know whether it is possible to do the following inside an NVIDIA CUDA kernel:

__global__ void compute(long *c1, long size, ...)
{
  ...
  long d[1000];
  ...
}

or the following:

__global__ void compute(long *c1, long size, ...)
{
  ...
  long d[size];
  ...
}
halfelf
kl.

4 Answers


You can do the first example; I haven't tried the second.

However, if you can help it, you might want to redesign your program not to do this. You do not want to allocate 4000 bytes of memory in your kernel. That will lead to a lot of use of CUDA local memory, since you will not be able to fit everything into registers. CUDA local memory is slow (400 cycles of memory latency).
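As a rough sketch of one way to restructure this (the block size of 256 and the shared-memory tiling are illustrative assumptions, not necessarily a fit for your algorithm), you could stage the data in shared memory, which exists once per block rather than once per thread:

__global__ void compute_shared(long *c1, long size)
{
    __shared__ long tile[256];                 // one slot per thread; launch with blockDim.x == 256

    long i = blockIdx.x * (long)blockDim.x + threadIdx.x;

    if (i < size)
        tile[threadIdx.x] = c1[i];             // each thread stages one element
    __syncthreads();                           // make the tile visible to the whole block

    // ... work on tile[] cooperatively here ...

    if (i < size)
        c1[i] = tile[threadIdx.x];             // write this thread's result back
}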

tkerwin
    A combination of the answers on this question is valuable. In particular, note Sebastian Dressler's point that every thread will allocate those 4000 bytes. So if you launch a grid of 1024 blocks of 256 threads each, that's about 1GB of memory required. – harrism Jul 11 '12 at 03:00

You can do #1, but beware that this will be done in EVERY thread!

Your second snippet won't work, because dynamic memory allocation at kernel runtime is not supported.
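If you need a working array whose length is only known at launch time, one common workaround is to pre-allocate a buffer of global memory from the host, with one slice per thread, and pass it to the kernel. A rough sketch, with made-up names (scratch, sliceLen):

#include <cuda_runtime.h>

// Each thread indexes its own private slice of a big pre-allocated buffer.
__global__ void compute(long *c1, long size, long *scratch, long sliceLen)
{
    long tid = blockIdx.x * (long)blockDim.x + threadIdx.x;
    long *d  = scratch + tid * sliceLen;   // this thread's "d[sliceLen]"
    // ... use d[0] .. d[sliceLen - 1] here ...
    (void)c1; (void)size;                  // placeholders for the real work
}

int main()
{
    const long sliceLen = 1000;
    const int  threads = 256, blocks = 64;

    long *c1, *scratch;
    cudaMalloc((void**)&c1, 1024 * sizeof(long));
    cudaMalloc((void**)&scratch, (size_t)blocks * threads * sliceLen * sizeof(long));

    compute<<<blocks, threads>>>(c1, 1024, scratch, sliceLen);
    cudaDeviceSynchronize();

    cudaFree(scratch);
    cudaFree(c1);
    return 0;
}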

Sebastian

You can allocate shared memory dynamically when you launch the kernel.

__global__ void compute(long *c1, long size, ...)
{
  ...
  extern __shared__ float shared[];
  ...
}

compute<<<dimGrid, dimBlock, sharedMemSize>>>(blah blah);

From the CUDA programming guide:

the size of the array is determined at launch time (see Section 4.2.3).
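For completeness, here is a sketch of how the host side might compute that third launch argument; the one-float-per-thread sizing is only an assumption, so size it to whatever your kernel actually needs:

// Hypothetical launcher: the third launch parameter is the number of bytes
// backing "extern __shared__ float shared[]" in the kernel above.
void launchCompute(long *c1, long size)
{
    int threadsPerBlock  = 256;
    int blocksPerGrid    = (int)((size + threadsPerBlock - 1) / threadsPerBlock);
    size_t sharedMemSize = threadsPerBlock * sizeof(float);  // one float per thread (assumption)

    compute<<<blocksPerGrid, threadsPerBlock, sharedMemSize>>>(c1, size);
}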

SomeKittens
Juan Leni

Dynamic memory allocation at kernel runtime is supported; check the SDK example on new/delete.
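A minimal sketch of what that looks like (it needs a device of compute capability 2.0 or later, and the allocations come from the device heap, which you can enlarge with cudaDeviceSetLimit):

__global__ void compute(long *c1, long size)
{
    long *d = new long[size];     // per-thread allocation at kernel runtime
    if (d == NULL)
        return;                   // the device heap can run out, so check

    // ... use d[0] .. d[size - 1] here ...

    delete[] d;
    (void)c1;
}

// Optionally enlarge the device heap before launching, e.g.:
// cudaDeviceSetLimit(cudaLimitMallocHeapSize, 128 * 1024 * 1024);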

adnan ozsoy