
I am using a library that was created to perform a simulation based on input data, with a single entry point, something like Run(Data data).

Unfortunately, the library internally stores values in static members (I don't know why, and I can't change it), so issues arise when attempting to run multiple simulations at the same time: the threads all end up mutating the same internal data.

The reason I want to run multiple simulations at the same time is to give users the ability to specify a range of values and have all the outputs aggregated and presented in a comparable format.

Originally I thought the easiest way around this would be to write a simple console application which could be spawned as a separate process to perform the calculation and dump the result. However, a large amount of data needs to be loaded into memory for the simulation to run, and spawning separate processes means this data has to be loaded multiple times. That is a great deal slower than running the simulations sequentially, and is likely to hog a few gigabytes of memory.

So basically I am looking for a way to create local storage for each thread, if I could modify the library I would be looking at code like this:

[ThreadStatic]
public static int Foo; // must be a field; [ThreadStatic] has no effect on an auto-property

Is there a way to make an assembly's static class declarations use thread-local storage, without modifying the assembly? Or perhaps a way to efficiently create multiple isolated copies of the same assembly at runtime?
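For reference, if the library *could* be modified, the more flexible alternative to `[ThreadStatic]` would be `ThreadLocal<T>`, which supports a per-thread initializer. Here is a minimal sketch assuming a hypothetical `SimulationState` class standing in for the library's static state:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the library's static state, rewritten with
// ThreadLocal<T> so each thread sees its own copy. Unlike [ThreadStatic],
// the initializer runs once per thread, so every thread starts from 0.
public static class SimulationState
{
    private static readonly ThreadLocal<int> foo = new ThreadLocal<int>(() => 0);

    public static int Foo
    {
        get { return foo.Value; }
        set { foo.Value = value; }
    }
}

public class Program
{
    public static void Main()
    {
        Parallel.For(0, 4, i =>
        {
            SimulationState.Foo = i;  // each thread writes its own slot
            Console.WriteLine(SimulationState.Foo == i); // always True
        });
    }
}
```

But since the library can't be changed, this only illustrates the behaviour being asked for, not a solution.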

Alex Hope O'Connor

1 Answer


You're on the right track with a separate console application, but since the start-up cost is so high, you need a more complicated approach.

Instead of spawning a new console application each time, create a pool of child processes which you can communicate with (either via stdin/stdout or some other inter-process communication such as WCF, remoting, or named pipes). Create a wrapper/manager class that keeps track of these processes, spawns new ones as needed, and knows which processes are in use. When a process is free, the manager can send it a new call and wait for the result.
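The process-pool idea can be sketched roughly as follows. This assumes a hypothetical `Worker.exe` console app that loads the data once at start-up, then reads one request line per simulation from stdin and writes one result line to stdout:

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

// Minimal sketch of a worker-process pool communicating over stdin/stdout.
// "Worker.exe" is a hypothetical console wrapper around the library's
// Run(Data) entry point; it pays the data-loading cost once per process.
public class WorkerPool : IDisposable
{
    private readonly BlockingCollection<Process> idle = new BlockingCollection<Process>();

    public WorkerPool(int size)
    {
        for (int i = 0; i < size; i++)
        {
            var p = Process.Start(new ProcessStartInfo("Worker.exe")
            {
                RedirectStandardInput = true,
                RedirectStandardOutput = true,
                UseShellExecute = false
            });
            idle.Add(p);
        }
    }

    public string Run(string request)
    {
        Process worker = idle.Take();        // blocks until a worker is free
        try
        {
            worker.StandardInput.WriteLine(request);
            return worker.StandardOutput.ReadLine();  // one line per result
        }
        finally
        {
            idle.Add(worker);                // return the worker to the pool
        }
    }

    public void Dispose()
    {
        foreach (var p in idle) { p.Kill(); p.Dispose(); }
    }
}
```

The pool size caps memory use (each worker holds one copy of the loaded data), and `BlockingCollection<T>` makes the "wait for a free worker" part thread-safe for callers.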

You could also do the same thing in memory by loading the library multiple times into separate AppDomains, but I personally think separate processes are easier and safer.
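The AppDomain approach works because each domain gets its own copy of every static field. A sketch, where `Simulator` is a hypothetical `MarshalByRefObject` wrapper around the library's entry point (this is .NET Framework; AppDomain creation is not supported on .NET Core/.NET 5+):

```csharp
using System;

// Each AppDomain loads its own copy of the assembly, so static state
// is isolated per domain. "Simulator" stands in for a wrapper that
// would call the library's Run(Data) internally.
public class Simulator : MarshalByRefObject
{
    private static int counter;          // static, but per-AppDomain

    public int Bump() { return ++counter; }
}

public class Program
{
    public static void Main()
    {
        AppDomain domainA = AppDomain.CreateDomain("SimA");
        AppDomain domainB = AppDomain.CreateDomain("SimB");

        var simA = (Simulator)domainA.CreateInstanceAndUnwrap(
            typeof(Simulator).Assembly.FullName, typeof(Simulator).FullName);
        var simB = (Simulator)domainB.CreateInstanceAndUnwrap(
            typeof(Simulator).Assembly.FullName, typeof(Simulator).FullName);

        Console.WriteLine(simA.Bump()); // 1
        Console.WriteLine(simA.Bump()); // 2
        Console.WriteLine(simB.Bump()); // 1 -- separate static state

        AppDomain.Unload(domainA);      // releases the domain and its statics
        AppDomain.Unload(domainB);
    }
}
```

Note that calls across the domain boundary are remoted, so arguments and return values must be serializable or themselves `MarshalByRefObject`.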

Samuel Neff
  • Thanks heaps for the advice, going to do some reading on both approaches, could you please elaborate on the performance of using WCF for inter-process communication on the same machine? Also what are the main safety concerns with loading and unloading new application domains? – Alex Hope O'Connor Jul 04 '13 at 03:32
    @AlexHopeO'Connor, I wouldn't worry about the performance of the IPC. It sounds like the process you're running will take significantly longer than any associated IPC. You should think about maintainability, though. MS definitely recommends WCF for all communications, both inter-process and even across AppDomains, but I've found remoting to be far easier, and if the data you're sending is simple, then named pipes or stdin/stdout would be equally easy. – Samuel Neff Jul 04 '13 at 03:43
    I'm not aware of safety concerns with multiple app domains. They're very commonly used to isolate third-party code and plugins. I personally feel it's easier to manage multiple processes, but I have only limited experience working with app domains (in a unit testing context). – Samuel Neff Jul 04 '13 at 03:45
  • I am going to go with the application domains approach. After doing some testing, WCF is a little too slow when trying to share the data through named pipes, and without shared memory each process still needs the entire data set loaded, rather than just the changed parameters for the simulation. – Alex Hope O'Connor Jul 04 '13 at 06:02