7

I have a multi-process .NET (F#) scientific simulation running on Windows Server 2008 SE with 64 processors. The duration of each time step varies between 1.5 sec and 2 sec. Because each process must wait for the other processes, the overall speed is the speed of the slowest process (2 sec * number of iterations). Therefore, I need to reduce the variation between the processes as much as possible.

Is there any way to force a set of processes to have exactly the same "computational time" available for their computations?

Security Hound
Oldrich Svec
  • Is there any way you can use multiple threads inside one process instead of multiple processes? – Alex Moore Nov 01 '11 at 14:36
  • `same "computational time" available` Yes, but only when it is running on your own custom OS that has no other processes, services, etc. – Ankur Nov 01 '11 at 15:53
  • Alex Moore: I need to use multiple processes. Ankur: I have 64 cores. The system should use 1 core and leave the rest for the computations... – Oldrich Svec Nov 01 '11 at 16:12
  • Is the variation due to CPU time stepping or due to variations in computational strength itself? Some time steps may just take longer. – Sebastian Good Nov 01 '11 at 16:51
  • Well, for my testing purposes, the steps and work are completely identical on all the cores. – Oldrich Svec Nov 02 '11 at 08:00

3 Answers

1

Is it possible for you to parallelize the 2 second series, so that you have multiple "branches" of the simulation occurring in parallel?

Example: Suppose that this is 1 simulation with 4 processes. Process 1 takes 2 seconds, so you cannot finish until process 1 completes.


process1---------------------------------------------- (2 sec)
process2-------- (0.5 sec)
process3---- (0.25 sec)
process4---------------------------- (1 sec)

You have a lot of idle time in there where most of your processes are waiting on process 1.
For the work you are trying to do, is it feasible to have more than 1 of these sets running at the same time? If so, then you could utilize your idle cores by working on other simulations while they are waiting for your longer running process to finish.
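To put a number on the idle time in the example above, here is a small sketch. The function names `idleTime` and `utilization` are made up for illustration; the durations are the four values from the diagram.

```fsharp
/// Core-seconds spent waiting in one step: every process idles
/// until the slowest one has finished.
let idleTime (durations: float list) : float =
    let slowest = List.max durations
    durations |> List.sumBy (fun d -> slowest - d)

/// Fraction of the available core time that does useful work in one step.
let utilization (durations: float list) : float =
    let slowest = List.max durations
    List.sum durations / (slowest * float (List.length durations))

// The four processes from the diagram above (seconds per step).
let example = [ 2.0; 0.5; 0.25; 1.0 ]

printfn "idle core-seconds: %f" (idleTime example)
printfn "utilization: %f" (utilization example)
```

For these four durations, 4.25 of the 8 available core-seconds are spent waiting, which is the time a second, independent simulation could use.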

JMarsch
1

I do not know how you can ask the OS to schedule your processes more fairly, but I do know that there is a lot of research on techniques that avoid the architecture you are using, precisely because this lowest-common-denominator effect is a major bottleneck in practice.

My favorite paper on this subject is "The cache complexity of multithreaded cache oblivious algorithms" by Frigo and Strumpen. They describe fascinating techniques such as space-time subdivision that turn a bulk-parallel computation such as the one you describe into an arbitrarily fine-grained asynchronous computation that makes load balancing effortless.

J D
  • The best way would be to dedicate 1 processor to 1 process. Is that possible? – Oldrich Svec Nov 01 '11 at 16:14
  • You might be able to pin a process to a processor but that doesn't solve the problem that some processors will complete before others because other work is being done on them (e.g. by the OS itself) or because they have experienced more cache misses etc. – J D Nov 01 '11 at 18:45
0

I'm not sure I understand 100% what you want to do. But for inter-process synchronization you can use a named EventWaitHandle or Semaphore.
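A minimal sketch of the named-event idea, assuming Windows (named events are an OS feature there) and a simple "start gate" rather than a full per-step barrier. `EventWaitHandle` and `EventResetMode` are the real .NET types; `openGate`, `waitForGate`, `releaseWorkers`, and the event name are hypothetical.

```fsharp
open System
open System.Threading

/// Opens the named manual-reset event, creating it unsignaled if it does not exist yet.
/// Every cooperating process must pass the same name.
let openGate (name: string) : EventWaitHandle =
    new EventWaitHandle(false, EventResetMode.ManualReset, name)

/// Worker side: block until the gate is signaled.
/// Returns false if `timeout` elapses first.
let waitForGate (gate: EventWaitHandle) (timeout: TimeSpan) : bool =
    gate.WaitOne(timeout)

/// Coordinator side: signal the gate, releasing every process waiting on it.
let releaseWorkers (gate: EventWaitHandle) : unit =
    gate.Set() |> ignore
```

Each worker would call `openGate "SimulationStartGate"` (a made-up name) and then `waitForGate`, while one coordinator process calls `releaseWorkers`. The coordinator should keep its handle open for the whole run, because a named event disappears once the last handle to it is closed.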

Update per comment

You can use ProcessorAffinity to constrain processes to specific processors.
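A minimal sketch of pinning a process to one core, assuming a 64-bit process on a machine with at most 64 logical processors. `Process.ProcessorAffinity` is the real .NET property; `affinityMaskFor`, `pinToCore`, and the idea of handing each worker a core index are illustrative assumptions.

```fsharp
open System
open System.Diagnostics

/// Builds an affinity bitmask with only the bit for `core` set
/// (core 0 -> 0x1, core 1 -> 0x2, core 3 -> 0x8, ...).
/// Assumes a 64-bit process, so that nativeint has 64 bits.
let affinityMaskFor (core: int) : nativeint =
    if core < 0 || core > 63 then
        invalidArg "core" "core must be in the range 0..63"
    nativeint (1L <<< core)

/// Restricts the given process to a single logical processor.
let pinToCore (proc: Process) (core: int) : unit =
    proc.ProcessorAffinity <- affinityMaskFor core
```

A worker could call `pinToCore (Process.GetCurrentProcess()) core` at startup, with cores 1 to 63 assigned to the workers so that core 0 stays free for the system. As the comments on the other answer point out, pinning does not remove variation caused by OS activity or cache misses on a given core.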

Daniel
  • This is not what I mean. In other words, I need to dedicate 1 processor to 1 process or similar, so that all my processes have equal "power/speed". – Oldrich Svec Nov 01 '11 at 16:16
  • ProcessorAffinity is the closest to a solution, so I will mark it as the answer. But I have read a lot of warnings against using ProcessorAffinity, so I will probably let the system decide what is best for me. – Oldrich Svec Nov 02 '11 at 10:05