
I wrote a Java program that draws the Mandelbrot set. To make it interesting, I divided the for loop that calculates the color of each pixel into 2 halves; each half is executed in its own thread, thus parallelizing the task. On a system with one dual-core CPU, the two-thread approach is nearly twice as fast as using just the main thread. My question is: on a system with two dual-core processors, will the parallelized task be split across different processors instead of just using the two cores of one processor? I suppose the former scenario would be slower than the latter, simply because of the latency of communication between the two CPUs over the motherboard.

Any ideas?

Thanks

nobody

2 Answers


Which processor (or core) a thread runs on is decided by the operating system. I don't think the OS generally makes any significant distinction between multi-CPU and multi-core systems, so threads on a single-processor system with 4 cores would be scheduled the same way as on a system with 2 dual-core processors.

Generally my experience has been that the threads will be more or less evenly distributed over all the available processors. So if you were to watch a CPU graph of your program running on a system with 4 cores, you would see roughly 25% utilization on each core. You can set thread affinity to a specific CPU/core on most operating systems, but I'm not sure if that functionality is available in Java.

Eric Petroelje

If I understand your description correctly, you have only 2 threads. It is not possible to utilize 4 cores simultaneously with 2 threads. Ideally you want at least as many threads as there are cores in the system. Because the cost of computing the Mandelbrot set is non-uniform (points inside the set are more expensive than points outside it), equal-sized pieces of the image do not take equal time, so the optimal number of threads may be higher (I would try 4× the number of cores).
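A minimal sketch of that idea, assuming a fixed-size pool from `java.util.concurrent` with one worker per core and the rows cut into about 4× as many chunks as there are cores. Note this is a variant of the suggestion above: it uses extra chunks rather than extra threads to even out the load. The class name `MandelbrotPool`, the `MAX_ITER` cap, and the viewing region are illustrative assumptions, not taken from the question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and constants are illustrative, not the asker's code.
class MandelbrotPool {
    static final int MAX_ITER = 256; // assumed iteration cap

    // Escape-time iteration count for the point c = (cr, ci).
    static int iterations(double cr, double ci) {
        double zr = 0.0, zi = 0.0;
        int n = 0;
        while (n < MAX_ITER && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            n++;
        }
        return n;
    }

    // Renders the region [-2, 1] x [-1.5, 1.5] into a row-major array,
    // splitting the rows into roughly 4 * cores chunks.
    static int[] render(int width, int height) {
        int cores = Runtime.getRuntime().availableProcessors();
        int chunks = Math.min(height, 4 * cores);
        int[] iters = new int[width * height];
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        for (int c = 0; c < chunks; c++) {
            final int yStart = (int) ((long) c * height / chunks);
            final int yEnd = (int) ((long) (c + 1) * height / chunks);
            // Each task owns the rows [yStart, yEnd); idle workers pick up the next chunk.
            pool.execute(() -> {
                for (int y = yStart; y < yEnd; y++) {
                    double ci = -1.5 + 3.0 * y / height;
                    for (int x = 0; x < width; x++) {
                        double cr = -2.0 + 3.0 * x / width;
                        iters[y * width + x] = iterations(cr, ci);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("rendering timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return iters;
    }
}
```

A worker that finishes a cheap chunk (mostly points outside the set) simply takes the next one, so no core sits idle while another is still grinding through an expensive region.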

I divided the for loop that calculates the color of each pixel into 2 halves

I am not sure what you mean here, but you should probably divide the outermost loop (the one that iterates over the Y coordinates) between threads, so that each thread writes a contiguous block of rows. That will reduce the likelihood of two or more CPUs contending for the same cache line (assuming the image is stored in row-major order).
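To make the row split concrete, here is a small sketch that computes the row boundaries for a given number of threads and prints which array indices each thread would write. It assumes a row-major array indexed as `y * width + x`; the class name `RowBands` and the sample dimensions are hypothetical.

```java
// Hypothetical helper; illustrates splitting the outer (Y) loop into contiguous bands.
class RowBands {
    // Returns bands + 1 boundaries; band i covers rows [b[i], b[i + 1]).
    static int[] boundaries(int height, int bands) {
        int[] b = new int[bands + 1];
        for (int i = 0; i <= bands; i++) {
            b[i] = (int) ((long) i * height / bands);
        }
        return b;
    }

    public static void main(String[] args) {
        int width = 800, height = 500, threads = 4; // assumed sample values
        int[] b = boundaries(height, threads);
        for (int i = 0; i < threads; i++) {
            int first = b[i] * width;        // first array index written by band i
            int last = b[i + 1] * width - 1; // last array index written by band i
            System.out.println("thread " + i + ": rows [" + b[i] + ", " + b[i + 1]
                    + "), indices " + first + ".." + last);
        }
    }
}
```

Each band maps to one unbroken index range, so only the few elements right at a boundary between two bands can share a cache line. Splitting by columns or by alternating pixels would interleave the threads' writes throughout the array.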


Note: `Runtime.getRuntime().availableProcessors()` returns the number of processors available to the JVM, which is normally the number of cores in the system.

finnw
  • I used a one-dimensional array `iters[width * height]` (I admit a nested for loop is superior) to store how many iterations each point takes to reach the end point. So if `height` is 500, the first thread calculates rows 0 to 250 (exclusive) and the second thread rows 250 to 500. – nobody Dec 22 '10 at 20:22
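
For reference, the split described in this comment might look roughly like the following sketch. It is a guess at the structure, not the asker's actual code: the class name `TwoThreadMandelbrot`, the `MAX_ITER` cap, and the viewing region are assumptions; only the `iters[width * height]` array and the two row ranges come from the comment.

```java
// Hypothetical reconstruction of the two-thread split; not the asker's actual code.
class TwoThreadMandelbrot {
    static final int MAX_ITER = 256; // assumed iteration cap

    // Escape-time iteration count for the point c = (cr, ci).
    static int iterations(double cr, double ci) {
        double zr = 0.0, zi = 0.0;
        int n = 0;
        while (n < MAX_ITER && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            n++;
        }
        return n;
    }

    // Fills rows [yStart, yEnd) of the one-dimensional, row-major array.
    static void renderRows(int[] iters, int width, int height, int yStart, int yEnd) {
        for (int y = yStart; y < yEnd; y++) {
            double ci = -1.5 + 3.0 * y / height;
            for (int x = 0; x < width; x++) {
                double cr = -2.0 + 3.0 * x / width;
                iters[y * width + x] = iterations(cr, ci);
            }
        }
    }

    static int[] render(int width, int height) {
        int[] iters = new int[width * height];
        int mid = height / 2; // e.g. 250 when height is 500
        Thread top = new Thread(() -> renderRows(iters, width, height, 0, mid));
        Thread bottom = new Thread(() -> renderRows(iters, width, height, mid, height));
        top.start();
        bottom.start();
        try {
            // join() makes both threads' writes visible before the array is returned.
            top.join();
            bottom.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return iters;
    }
}
```

Because the two threads write disjoint row ranges, no locking is needed on `iters`.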