
I understand that in CUDA's memory hierarchy, we have things like shared memory, texture memory, constant memory, registers, and of course the global memory which we allocate using cudaMalloc().

I've been searching through whatever documentation I can find, but I have yet to come across any that explicitly explains what global memory is.

I believe that the global memory allocated is on the GDDR of the graphics card itself and not the RAM that is shared with the CPU, since one of the documents did state that the pointer cannot be dereferenced by the host side. Am I right?

gamerx
  • This is probably what you are looking for: http://stackoverflow.com/questions/8684770/how-is-cuda-memory-managed?rq=1 – Doug Oct 29 '12 at 18:37

3 Answers


Global memory is a virtual address space that can be mapped to device memory (memory on the graphics card) or page-locked (pinned) host memory. The latter requires CC > 1.0.
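A minimal sketch of both cases using the CUDA runtime API (error checking omitted for brevity; buffer sizes are arbitrary):

```cuda
// Global memory backed by device memory (GDDR on the card):
float *d_buf;
cudaMalloc(&d_buf, 1024 * sizeof(float));

// Global memory backed by page-locked (pinned) host memory,
// mapped into the device's address space:
float *h_buf, *d_alias;
cudaHostAlloc(&h_buf, 1024 * sizeof(float), cudaHostAllocMapped);
cudaHostGetDevicePointer(&d_alias, h_buf, 0);
// d_alias can now be dereferenced by a kernel; accesses go over PCIe.
```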

Local, constant, and texture memory are allocated in global memory but accessed through different address spaces and caches.

On CC > 2.0 the generic address space allows mapping of shared memory into the global address space; however, shared memory always resides in per-SM on-chip memory.

Greg Smith
  • Note that page locked memory can be mapped (kernel memory requests traverse the PCIe bus to the CPU's RAM chips) or unmapped (the memory must first be copied to the physical off-die RAM chips on the graphics card before it's accessible to a kernel). See section 3.2.4 of the CUDA C Programming Guide. – Mr Fooz Jun 24 '12 at 16:56

Global memory is off-chip but on the graphics card.

Local memory is stored in global memory, but addresses are interleaved in such a way that when arrays are stored there, accesses are coalesced when each thread in the warp reads from the same index in its array.

Constant and texture memory is also (initially) stored in global memory, but it is cached in on-chip caches.

Shared memory and the L1 and L2 caches are on-chip.
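As an illustrative sketch, the memory spaces above map onto declarations like these (a hypothetical kernel, not from the question):

```cuda
__constant__ float coeff[16];           // stored in global memory, served via the on-chip constant cache

__global__ void scale(float *g_out)     // g_out points into off-chip global memory
{
    __shared__ float tile[256];         // on-chip, per-block shared memory
    float tmp = coeff[threadIdx.x % 16];  // held in a register; register spills go to local memory
    tile[threadIdx.x] = tmp;
    __syncthreads();
    g_out[threadIdx.x] = tile[threadIdx.x];
}
```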

Roger Dahl
  • Global memory can be in device memory or pinned host memory. – Greg Smith Jun 24 '12 at 16:07
  • 1
    @GregSmith: Pinned host-memory can be mapped into the memory space of the device, but I don't think it is referred to as global memory then. The CUDA Programming Guide 4.2, section 5.3.2.1 states that, "Global memory resides in device memory", though that's kind of a weird sentence in itself :) – Roger Dahl Jun 24 '12 at 16:14

This is discussed in Section 3.2.2 of the CUDA C Programming Guide. In short, all types of memory, i.e. shared, constant, texture and global, reside in the memory of the device, i.e. the GPU, itself.

You can, however, specifically declare parts of memory to be "Mapped", i.e. memory on the host to be accessible from the device. For this, see Section 3.2.4 of the Programming Guide.
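A short sketch of that mapped (zero-copy) setup, assuming the device supports host-memory mapping (error checking omitted):

```cuda
// Must be set before the CUDA context is created on this device:
cudaSetDeviceFlags(cudaDeviceMapHost);

int *h_data, *d_data;
cudaHostAlloc(&h_data, 256 * sizeof(int), cudaHostAllocMapped);  // pinned host memory
cudaHostGetDevicePointer(&d_data, h_data, 0);  // device pointer a kernel can dereference
// Kernel reads/writes through d_data traverse the PCIe bus to host RAM.
```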

Pedro
  • Yes, I came across the host and device memory thing. However, a peer of mine interprets global memory as being allocated in main memory, which I don't think is the case. – gamerx Jun 24 '12 at 15:21
  • 1
    @gamerx: Your peer may have gotten confused with Mapped memory, described in Section 3.2.4.3 of the Programming guide. – Pedro Jun 24 '12 at 15:40
  • 1
    This is incorrect. Constant, texture, and global can reside in both device and pinned host memory. The developer only has control of allocating portions of global in device memory using cudaHostAlloc and cudaHostRegister. The rest is up to the driver. – Greg Smith Jun 24 '12 at 16:06
  • @GregSmith: This doesn't seem to be what is implied in Section 3.2.4. From what I understand there, the developer has to explicitly aloocate memory as `cudaHostAllocMapped` for this to happen. – Pedro Jun 24 '12 at 16:18