I develop a Lattice Boltzmann (Fluid dynamics) code using F#. I am now testing the code on a 24 cores, 128 GB memory server. The code basically consists of one main recursive function for time evolution and inside a System.Threading.Tasks.Parallel.For loop for a 3D dimensional space iteration. The 3D space is 500x500x500 large and one time cycle takes for ever :).
let rec timeIterate time =
// Time consuming for loop
System.Threading.Tasks.Parallel.For(...)
I would expect the server to use all 24 cores i.e. to have 100% usage. What I observe is something between 1% - 30% usage.
And my questions are:
- Is F# an appropriate tool for HPC computations on such servers?
- Is it realistic to use up to 100% of CPU for a real world problem?
- What should I do to obtain a high speed up? Everything is in one big parallel for loop so I would expect that it is all what I should do...
- If F# is NOT an appropriate language, what language is?
Thank you for any suggestions.
EDIT: I am willing to share the code if anyone is interested to take a look at it.
EDIT2: Here is the stripped version of the code: http://dl.dropbox.com/u/4571/LBM.zip It does not do anything reasonable and I hope I have not introduced any bugs by stripping the code :)
The startup file is ShearFlow.fs and at the of the file bottom is
let rec mainLoop (fA: FArrayO) (mR: MacroResult) time =
let a = LBM.Lbm.lbm lt pA getViscosity force g (fA, mR)