
I have a GTX 570 with 2 GB of memory. When I try to allocate more than about 804 MB with a single cudaMalloc call, I run into trouble. Does anyone have any idea why that is? It is my first call, so I doubt it is fragmentation.

No problem:

Memory avaliable: Free: 2336116736, Total: 2684026880
requesting 804913152 bytes
no error
Memory avaliable: Free: 1531199488, Total: 2684026880
requesting 804913152 bytes
no error
Memory avaliable: Free: 726286336, Total: 2684026880

Problem:

Memory avaliable: Free: 2327601152, Total: 2684026880
requesting 805306368 bytes
out of memory
Memory avaliable: Free: 2327597056, Total: 2684026880
requesting 805306368 bytes
out of memory
Memory avaliable: Free: 2327597056, Total: 2684026880
Aktaeon
  • An item in the release notes at http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html, section "Known Issues", subsection "Windows", second bullet, may apply: With the WDDM driver under Windows, the maximum size of a device allocation can be limited by the size of system memory. – njuffa Dec 13 '12 at 07:30
  • I am running Windows 8 x64 with 12 GB of RAM (about 9 GB free) and CUDA 5. Good idea, I will check the release notes too. – Aktaeon Dec 13 '12 at 09:50

1 Answer


This is caused by restrictions imposed by the Windows WDDM subsystem. There is a hard limit on how much memory can be allocated, calculated as

MIN ( ( System Memory Size in MB - 512 MB ) / 2, PAGING_BUFFER_SEGMENT_SIZE )

For desktop Windows, PAGING_BUFFER_SEGMENT_SIZE is about 2 GB, IIRC. You have two options to work around this:

  1. Get a Tesla card and use the dedicated Windows TCC mode driver, which takes memory management of the device away from WDDM and eliminates the restriction.
  2. Install Linux or use a CUDA-aware live distribution for your GPU computing. The Linux driver has no restrictions on memory allocations beyond the free memory capacity of the device.
talonmies
  • Hmm, thanks for pointing this out. I see that many others have the same problem. However, with about 12 GB of system memory, the limiting factor should be PAGING_BUFFER_SEGMENT_SIZE. If 2 GB is the correct number, my code should be able to allocate a lot more than just the 840MB. Or am I missing something? – Aktaeon Dec 13 '12 at 12:26
  • According to the formula, it should be possible to allocate 2GB on any machine with more than 4.5GB memory (given a PAGING_BUFFER_SEGMENT_SIZE of 2GB). If you are adventurous, you can modify your card to [get around NVIDIA's artificial limitation of TCC mode on non-Tesla cards](https://devtalk.nvidia.com/default/topic/489965/cuda-programming-and-performance/gtx480-to-c2050-hack-or-unlocking-tcc-mode-on-geforce/). – Roger Dahl Dec 14 '12 at 02:15