
I am presently learning CUDA and I keep coming across phrases like

"GPUs have dedicated memory which has 5–10X the bandwidth of CPU memory"

See here for reference on the second slide

Now what does bandwidth really mean here? Specifically, what does one mean by

  • bandwidth of the CPU
  • bandwidth of the GPU
  • bandwidth of the PCI-E slot the GPUs are fitted onto the motherboard

My background in computer architecture is very poor, so if someone can give a very simple explanation of these terms, it will be really helpful.

My very limited understanding of bandwidth is the highest possible number of gigabytes that can be transferred per second from the CPU to the GPU. But that does not explain why we need to define three types of bandwidth.

smilingbuddha

3 Answers


Bandwidth is the transfer speed between two given objects. The GPU memory bandwidth is the maximum amount of data transfer that can occur between the GPU chip and its dedicated memory. CPU memory bandwidth is the maximum amount of data that can be transferred between the CPU and system memory. PCI-E bandwidth is the maximum amount of data that can be transferred between the South Bridge chip and the specific PCI-E device.

Of course, if the GPU requires data that is in system memory, then the fastest it can receive it is limited by the slowest link in the chain. All of this depends on what needs the memory and which nodes are required to retrieve it.

Matthew

There are three different memory buses in a current CPU/GPU system with discrete GPU:

  1. the GPU (aka "device") memory bus that connects the GPU to its own RAM.
  2. the CPU (aka "host" or "system") memory bus that connects the CPU to its own RAM.
  3. the PCI-e bus, which connects the CPU chipset to its peripherals, including the GPU.

Each of these buses has a physical bus width (in bits), a clock speed (how many times per second the data signals on the bus can be changed), and bandwidth (aka throughput), in bits per second (which can be converted to gigabytes per second). The peak bandwidth is determined by the bus width multiplied by the clock rate of the bus. Achievable bandwidth must also take into account any overhead (e.g. PCI-e packet overhead).

http://en.wikipedia.org/wiki/Bandwidth_(computing).

harrism

Bandwidth is the rate at which data can be transferred between two points.

"GPUs have dedicated memory which has 5–10X the bandwidth of CPU memory"
means that the INTERNAL memory bandwidth between components on the GPU is much higher than the bandwidth for moving data between main memory and the GPU, so once your data is on the card any copies are very fast.

Typically, even on a low-end CUDA card the internal bandwidth will be 30–50 GB/s, while the actual achievable bandwidth over the PCI-E slot to main memory might be below 1 GB/s.

Martin Beckett