
I have a Multiobjective Particle Swarm Optimization algorithm for a complex problem. It uses a large population (4000 particles), and each simulation is time-consuming (4 to 6 hours of execution).

The algorithm keeps an archive, a repository of the best solutions found so far. To analyze the algorithm's convergence and behavior, I need to save some data from this repository, and sometimes from the entire population, at each iteration.

Currently, in each iteration I copy some attributes from the particle objects (from the repository and/or the population) and format them into a Java StringBuffer, in a method that runs in a separate thread from the simulation. Only at the end of the program's execution do I save the buffer to a text file.

I think my algorithm is wasting memory by doing this. But with performance also in mind, I don't know the best way to save all this data. Should I follow the same logic but write to a .txt file at each iteration instead of at the end of the algorithm? Or should I save to a database? If so, should I save at each iteration, at the end, or at some other point? Or should I approach it differently altogether?

Edit: Repository data is usually in the 5 to 10 MB range, while the population data occupies 100 to 200 MB of memory. Every time I run the program I need about 20 simulations to analyze average convergence.

Jon Cardoso-Silva
  • How much data we talking? 2 gigs, you better save it to a file. 100MB? You can get away with memory. 100K get out of town and stop being such a memory miser :P – thatidiotguy Dec 17 '12 at 20:17
  • True, that's important and I forgot to mention that :( Well, the repository data is only [5 - 10] MB for each simulation, but population data are about [100 - 200] MB per simulation. But each feature I add to the program, I need to run at least 20 simulations – Jon Cardoso-Silva Dec 17 '12 at 20:20
  • so once again, you have failed to tell us how much memory this "simulation data" is. I don't know what you are doing. Rattling off stuff about "features" and "repository data" is meaningless. Is your JVM crashing because you are running out of memory? How much memory does it already have allocated? How much memory does the machine have in total? Is the data of one simulation needed for another? i.e. once data is computed, is it even needed ever again by the program? If not, then writing to a file could be done on another thread as the next simulation begins... etc etc – thatidiotguy Dec 17 '12 at 20:24
  • Sorry. The "repository" is an archive, a list of best solutions found so far (depending on the algorithm implementation it can store 200 to 300 solutions). This list change each iteration and I need to save the lists from all iterations to analyze the convergence. The JVM is not crashing, but I wonder if this is the best approach or if there's another way of doing it more efficiently. The data is saved for later analysis, so it's not needed to another simulation. – Jon Cardoso-Silva Dec 17 '12 at 20:30
  • If your program does what it must do, and does it fast enough, where's the problem? What would you gain by doing it in a different way? – JB Nizet Dec 17 '12 at 20:36
  • I agree with JB. If it aint broke dont fix it. Unless you are actively trying to reduce memory footprint for some reason. If you are, then you trade off for speed in execution, and you are going to have to take time to write information to the hard drive which is very slow. – thatidiotguy Dec 17 '12 at 20:38
  • Hmm... that's true. That would only make it potentially slower for no reason. If I truly face any JVM crash, then I think of a different approach. Thank you guys a lot. – Jon Cardoso-Silva Dec 17 '12 at 20:42
  • Please don't use a StringBuffer if you can use a StringBuilder. BTW if you can save your information as bytes instead of a chars it will use half as much memory. You can buy 32 GB for less than $200. For simulations you usually want lots of memory. – Peter Lawrey Dec 17 '12 at 21:42

1 Answer


StringBuffer keeps its characters in an array, which is a contiguous area of memory. Whenever the buffer needs to grow, it allocates a new array roughly twice as large and copies the old contents into it. This is fine for most applications, but if you expect the buffer to become really large and want to avoid the overhead of managing one contiguous block of memory, you can replace it with a list of Strings (or StringBuffers). This requires more memory overall, but the memory does not have to be contiguous.
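
A rough sketch of that idea follows, using a list of fixed-size StringBuilder chunks. The class name `ChunkedTextBuffer` and the chunk-size parameter are made up for this illustration, and it is only one possible way to do it.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: collects text in fixed-size chunks instead of one growing array. */
final class ChunkedTextBuffer {
    private final int chunkSize;
    private final List<StringBuilder> chunks = new ArrayList<>();
    private long length = 0;

    ChunkedTextBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Appends text, starting a new chunk whenever the current one is full. */
    void append(CharSequence text) {
        int pos = 0;
        while (pos < text.length()) {
            StringBuilder last = chunks.isEmpty() ? null : chunks.get(chunks.size() - 1);
            if (last == null || last.length() == chunkSize) {
                // Each chunk is allocated once at its full capacity and never regrown.
                last = new StringBuilder(chunkSize);
                chunks.add(last);
            }
            int n = Math.min(chunkSize - last.length(), text.length() - pos);
            last.append(text, pos, pos + n);
            pos += n;
        }
        length += text.length();
    }

    long length() {
        return length;
    }

    int chunkCount() {
        return chunks.size();
    }

    /** Writes the chunks in order, without first joining them into one big string. */
    void writeTo(Writer out) throws IOException {
        for (StringBuilder chunk : chunks) {
            out.append(chunk);
        }
    }
}
```

At the end of the run, `writeTo` can be handed a `BufferedWriter` for the output file, so the full text never has to exist as one contiguous array.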

tcb