
I have a program (well, a PHP script) which does some pretty heavy text searching - it loads a 2 MB file and a 40 MB file and searches through them to find where each word that appears in the first is present in the second.
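Roughly, the task is equivalent to this shell one-liner (shown here with tiny stand-in files; in reality `words.txt` would be the 2 MB word list and `corpus.txt` the 40 MB text):

```shell
# Tiny stand-ins for the real files, just to illustrate the task.
printf 'foo\nbar\n'                      > words.txt
printf 'a foo here\nnothing\nbar line\n' > corpus.txt

# -F fixed strings, -w whole words, -f read patterns from file, -n line numbers:
# print every line of corpus.txt containing any word from words.txt.
grep -nFwf words.txt corpus.txt
```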

I have a 4-core CPU (personal computer). When I start the process running, CPU usage jumps to 25%, load of 1. I start the process running again on a separate file, and CPU usage goes to 50%, load of 2. Does this reduce the efficiency of the individual processes? I.e. does it make each one take longer to complete than if they were run separately? What about if I ran 4 processes, taking my CPU usage up to 100%? Would they run slower then?

I assume that running the two processes in parallel will complete faster than if I ran them in series - is this correct? Would it still be true if I ran more than two, say 3 or 4? Or more? Where is the bottleneck here? I assume that as long as I keep the number of processes equal to or less than the number of cores, the CPU can handle it, but what about memory access? Would the processes have to wait while reading memory?

Benubird

4 Answers


Short answer: Benchmark it.

Long answer: Each individual process will take slightly longer to complete (because frequency scaling backs the clock off as more cores become busy), but overall the most efficient thing to do is to load every core to 100%.
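The benchmark itself is simple: time one run alone, then time two runs launched concurrently, and compare wall-clock times. A sketch of the pattern, with `sleep 1` standing in for the real command (e.g. `php search.php words.txt corpus.txt`):

```shell
# One run alone.
time sh -c 'sleep 1'

# Two runs in parallel: launch both in the background, then wait for both.
# On two or more free cores the wall-clock time stays close to a single run.
time sh -c 'sleep 1 & sleep 1 & wait'
```

If the parallel wall-clock time is close to the single-run time, the jobs are scaling well; if it approaches the sum of the individual times, they are contending for something.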

MikeyB

Whether running things in parallel improves performance or degrades it depends on a lot of things. For example:

  1. If you are doing a lot of IO - comparing big files - then the bottleneck will be the disk and not the CPU, and your performance will surely go down.
  2. Likewise, if your files only just fit into the RAM you have in your system and you run more than one process, then the bottleneck will be RAM, and again the machine will be stuck waiting on IO.

So it varies case by case. But in your case, I am pretty sure that your performance will only improve if you run things in parallel, and I can't see a scenario in which it would degrade your efficiency, unless I am missing some point which I can't think of.
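Either bottleneck shows up in the kernel's counters. On Linux, a quick way to peek at idle vs. I/O-wait time directly (tools like `vmstat 1` or `iostat -x 1` show the same numbers continuously):

```shell
# The "cpu" line of /proc/stat holds cumulative jiffy counts; field 5 is
# idle time, field 6 is iowait. If iowait climbs while the searches run
# and the CPU is not maxed out, the disk (or RAM paging) is the bottleneck.
awk '/^cpu /{print "idle jiffies:", $5, " iowait jiffies:", $6}' /proc/stat
```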

Napster_X

Generally - yes. Ignore the coding part for a moment.

Modern multi-core processors have a boost mode: if only a small number of cores are in use, their frequency is raised a little. As such, using all cores makes each individual core slightly slower. The details depend on the processor.

THAT SAID: the total throughput will still be higher, as the individual boost is normally VERY small (a few hundred MHz) compared to gaining another whole core. As such you are really better off using all cores. The boost was mostly done for those cases that do not scale well and need a high per-core frequency - which includes single-threaded games ;)
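On Linux you can watch the effect yourself: check the reported per-core clocks while idle, then again with all the searches running, and a boosting CPU will show the MHz dip slightly as more cores go busy.

```shell
# Current clock of each logical core (Linux; some VMs/containers omit this,
# hence the fallback message).
grep 'MHz' /proc/cpuinfo || echo "no per-core frequency info exposed here"
```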

As for your question about memory access: I hope you are aware that a modern server has a memory access speed in excess of 50 GB (that is gigabytes) per second from DRAM - more from the caches. So it is unlikely you hit this limit. IO may be a problem, but that will be visible as the CPU not maxing out while the IO wait stats go up. Caching helps here, a lot.
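If you want a rough number for your own box rather than a server figure, streaming zeros through memory with `dd` gives a crude single-core floor (it measures copy throughput, not true DRAM bandwidth, but it is enough for a sanity check against a 40 MB working set):

```shell
# Copy 256 MB of zeros through RAM and let dd report the MB/s rate.
dd if=/dev/zero of=/dev/null bs=1M count=256
```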

TomTom
  • Can you expand on this? I doubt the CPU has a cache of 40 MB, which means it is going to be repeatedly reading data from memory. Just to grab some arbitrary numbers - say it needs to read the whole of one file for every kb of the other - then even at 50 GB/s, that's still 26 minutes (and it is actually taking much longer). So I guess my question really is: is that 50 GB/s split between the cores? And can the CPU process faster than the memory can be retrieved? – Benubird Dec 17 '13 at 17:46
  • ??? NO, sorry, I don't teach computer basics here. And seriously, caches can be LARGE. 30 MB is possible today. Also, you say 26 minutes? This is 26*60*50 = 78 TERAbytes of memory throughput? Sure your math is right? And OBVIOUSLY (for anyone knowing computer basics) memory bandwidth is shared. You think the memory slot magically doubles in bandwidth when you have twice the cores? Obviously, though, smart programming is the answer. Stuff like an index and full-text search are not rocket science. They are baseline programming. Return the program to the programmer, asking him to learn his job. – TomTom Dec 17 '13 at 18:26

It sounds like you don't have much I/O wait on either the hard drive or the network. Assuming you have GBs of RAM, the 42 MB of files should easily be loaded into RAM. At that point, four parallel processes should give you the best results. You'll see only minor context switching when normal OS processes need to run.
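A convenient way to keep exactly one job per core without counting by hand is `xargs -P` (here `search.php` and the file names are placeholders for your script; the `echo` only prints the commands - drop it to run them for real):

```shell
# nproc reports the core count; xargs -P runs that many jobs concurrently.
printf '%s\n' corpus1.txt corpus2.txt corpus3.txt corpus4.txt |
    xargs -P "$(nproc)" -I{} echo php search.php words.txt {}
```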

On NUMA systems, each core has an assigned memory pool. Performance can degrade when the kernel migrates a process to a different core because the files are still in the original core's memory pool. Honestly, I'm not sure that applies to personal computers.
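If migration did turn out to matter, you can pin each worker to a core with `taskset` (util-linux); on multi-socket NUMA boxes, `numactl --cpunodebind=N --membind=N` additionally pins its memory to the matching node. A sketch, again with `search.php` as a placeholder and `echo` keeping it side-effect free:

```shell
# Pin each worker to a fixed core so the scheduler cannot move it away
# from the memory its file data was loaded into.
taskset -c 0 echo "would run: php search.php words.txt corpus1.txt" &
taskset -c 1 echo "would run: php search.php words.txt corpus2.txt" &
wait
```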

skohrs