13

In a multi-CPU machine, do the different CPUs compete for the same memory bandwidth, or do they access DRAM independently?

In other words, if a program is memory bandwidth limited on, say, a 1-CPU 8-core system, would moving to a 4-CPU 4*8-core machine have a chance to speed it up (assuming the CPUs and DRAM are comparable)?

MWB

4 Answers

5

The answer to your main question is: it depends. What does it depend on? It depends on which camp your setup falls into, and technically speaking there are two.

In the first camp, known as Shared-Memory Multicore, the answer to your question would be "Yes". In this model, multiple processors, each with multiple cores, share memory over a common bus (which is where you get your bottleneck), and other than that bus there is nothing connecting the CPUs together. This is the camp the typical consumer-grade computer falls into.

In the second camp, known as Distributed-Memory Multicore, the answer to your question is "No". In this hardware setup each processor has its own private memory, and an interconnect links the processors directly to one another. A common protocol for programming this scenario is the Message Passing Interface (MPI). It also means the CPUs don't all have to be physically in the same box, or even the same room, as the RAM they access. You probably won't find this kind of setup in a home; think research facilities, labs, universities, and mid-to-large businesses.
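For illustration, a minimal distributed-memory sketch in C might look like the following (this assumes an MPI implementation such as Open MPI or MPICH is installed, compiled with mpicc; the numbers and names are just placeholders). Each rank works only on its own private memory and exchanges results explicitly with messages:

```c
/* Minimal distributed-memory sketch: each MPI rank sums its own private
 * slice of work, and the partial results are combined with an explicit
 * message (a reduction). No rank ever touches another rank's memory. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process computes on data that lives only in its own DRAM. */
    long local_sum = 0;
    for (long i = rank; i < 1000000; i += size)
        local_sum += i;

    /* Partial results travel over the interconnect, not a shared bus. */
    long total = 0;
    MPI_Reduce(&local_sum, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total = %ld\n", total);

    MPI_Finalize();
    return 0;
}
```

Run with something like `mpirun -np 4 ./a.out`; each of the four processes could live on a different node with its own RAM.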

To answer your second question: again, it depends. It depends on whether the program was written to take advantage of the system's capacity for parallel execution. Even though your consumer-grade computer with one or two processors shares a single memory bus, if the program was written with parallelism in mind you will notice a performance increase. Otherwise, a serial program will simply be executed serially on just one core.
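As a rough sketch of what "written with parallelism in mind" means on a shared-memory machine, here is a minimal OpenMP example in C (assuming a compiler with OpenMP support, e.g. `gcc -O2 -fopenmp`; the array size is arbitrary):

```c
/* Minimal shared-memory sketch: all the threads work on the same array in
 * the same RAM, so they share (and can saturate) the memory bus, but the
 * arithmetic itself is spread across the cores. */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    long n = 100000000;                       /* ~800 MB of doubles */
    double *a = malloc((size_t)n * sizeof *a);
    if (!a)
        return 1;
    for (long i = 0; i < n; i++)
        a[i] = 1.0;

    double sum = 0.0;
    /* Without this pragma the loop below runs serially on one core. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += a[i];

    printf("sum = %.0f using up to %d threads\n", sum, omp_get_max_threads());
    free(a);
    return 0;
}
```

Note that a loop like this is mostly streaming data from RAM, so on a single shared bus adding threads helps only until the bus is saturated; that is exactly the bottleneck the question is about.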

If you are into the nitty-gritty of multi-core processing, and how memory is accessed by a program, a good "gateway resource" to expand your cranium is Flynn's Taxonomy. Just Googling it will take you down the rabbit hole, if you are interested.

Edit: To give credit where credit is due, I highly recommend Professional Parallel Programming in C# by Gaston C. Hillar. This delightful book has been the most revealing on the topic of parallelism for me in my short career. It helps clear the muddy water on distinctions between Parallel Programming and Multi-core Programming and the types of multi-core processing I've just mentioned, complete with diagrams!

Isaiah Nelson
    *the answer to your question would be "Yes"* I believe this is incorrect, in general, depending on the hardware. With NUMA, different CPUs can have separate DRAM access channels. – MWB May 07 '13 at 03:39
  • *other than that, there is nothing connecting the CPUs together* This is also incorrect (See Csaba's answer). – MWB May 07 '13 at 03:51
2

Yes, all the CPUs compete for the same bandwidth. There's only one hardware connection from the CPU chip to the RAM, so all accesses must go through it.

The different levels of CPU cache, whether shared between cores or not, help alleviate this problem: only cache misses need to go to the RAM itself. See http://en.wikipedia.org/wiki/CPU_cache#Multi-core_chips
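A quick way to see this effect is to walk the same buffer with different access patterns; the sketch below (plain C, timings and sizes are only illustrative) sums the same bytes twice, but the strided pass defeats the cache and so almost every access becomes a trip over the memory bus to DRAM:

```c
/* Rough sketch of why caches matter: a sequential pass mostly reuses the
 * cache lines it just fetched, while a large-stride pass misses almost
 * every time, and each miss has to go out to DRAM. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double walk(const char *buf, size_t n, size_t stride)
{
    clock_t t0 = clock();
    volatile long sum = 0;
    /* Every byte is visited exactly once, whatever the stride is. */
    for (size_t start = 0; start < stride; start++)
        for (size_t i = start; i < n; i += stride)
            sum += buf[i];
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    size_t n = 256u * 1024 * 1024;      /* 256 MiB, far larger than any cache */
    char *buf = calloc(n, 1);
    if (!buf)
        return 1;

    printf("stride 1:    %.2f s (mostly cache hits)\n",   walk(buf, n, 1));
    printf("stride 4096: %.2f s (mostly DRAM accesses)\n", walk(buf, n, 4096));

    free(buf);
    return 0;
}
```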

Mark Ransom
  • *only one hardware connection from the CPU chip to the RAM* -- My question is more about multiple CPUs rather than multiple cores on the same CPU. It's not clear to me if your answer applies to them. – MWB May 06 '13 at 22:12
  • @MaxB, you're right that I didn't fully understand the question. Multiple CPU chips may have independent access to different banks of memory, but I'm not familiar enough with those setups to say anything more. – Mark Ransom May 06 '13 at 22:49
  • The question concerns multiple CPUs, not multiple cores. – paul-g Mar 11 '15 at 13:59
2

Do multiple CPUs compete for the same memory bandwidth?

Not necessarily. Non-Uniform Memory Access and multi-channel memory architecture can result in higher total memory bandwidth than what would have been achievable with a single CPU.
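On Linux you can see and exploit this explicitly. As a minimal sketch (assuming a NUMA machine with libnuma installed; link with -lnuma), memory can be placed on a particular node so that the cores of that socket use their own local memory controller instead of competing for a single bus:

```c
/* Rough NUMA sketch: report the number of NUMA nodes (each with its own
 * memory controller and channels) and allocate a buffer in node 0's
 * local DRAM. Threads pinned to node 0's cores would then use that
 * node's memory channels rather than a shared bus. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not supported on this system\n");
        return 1;
    }

    int nodes = numa_max_node() + 1;
    printf("NUMA nodes (separate memory controllers): %d\n", nodes);

    size_t size = 64 * 1024 * 1024;
    void *buf = numa_alloc_onnode(size, 0);   /* memory local to node 0 */
    if (buf) {
        memset(buf, 0, size);                 /* touch it so pages are placed */
        numa_free(buf, size);
    }
    return 0;
}
```

Running `numactl --hardware` on such a machine shows the nodes and how much memory is attached to each.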

MWB
  • If you are attempting to answer your question, you are providing an answer to an entirely different question and not one you've asked on this thread. You simply asked 1) Do multiple processors compete for the same memory and 2) Would a program benefit by adding more processors. To which I have given you two perfectly valid answers. You, conversely, have stated that the two sources you provide offer higher memory bandwidth to a single CPU. Again this is not an answer to your own question but an entirely new one. Please provide an edit if you wish to change your question – Isaiah Nelson May 07 '13 at 03:11
0

If you use relatively new hardware, and your software's memory limitation comes partly from CPU-to-CPU communication, then you have a good chance of scaling reasonably. Older x86 SMP architectures used a single front-side bus (FSB), and each CPU could share data with the others only over that one bus. With the Opteron server processor line, the CPUs were also connected to each other individually by dedicated HyperTransport links, which let Opteron servers scale much better than the Intel servers of that time. Intel later hired some of the engineers who had developed HyperTransport for AMD (and who, for the record, drew on their experience with the Alpha EV6 bus), and they developed a similarly scalable CPU-to-CPU SMP link architecture for Intel called QPI. So today's Intel server products are also far more scalable than the old FSB machines. If you are in non-x86 server land, you probably also have an architecture that scales this way. In that case, if your software needs interaction between cores, moving to more CPUs can significantly speed it up.

Csaba Toth