I am a mechanical engineer by training and work in a research environment, mostly extending a large existing numerical code base written in C that is more than 25 years old. I have recently decided that I would like to learn how to design a serious piece of scientific software from scratch.
I have spoken to a number of academics in the CS department at the university, and it seems to be a commonly held belief that the people most likely to be building large-scale numerical applications are in mechanical engineering, chemical engineering, or biology departments. At the same time, most of the people writing these applications have little or no training in software design principles.
Like most engineers, I learn by doing, so I am setting myself the following task: develop an adaptive mesh scheme that locally refines/coarsens based on the location of an arbitrarily moving curve, and solve the heat equation (or some other PDE) across this grid.
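To make the refinement criterion concrete, here is a rough Python sketch of what I have in mind. It rests on assumptions of my own: a uniform 2D grid of square cells, a curve represented by sample points, and a hypothetical `flag_cells` helper with a `band` parameter. It only marks cells that lie close to the curve; it does not do the actual refinement/coarsening or the PDE solve.

```python
import math

def flag_cells(centres, width, curve_points, band):
    """Mark each square cell whose centre lies within `band` cell widths
    of any sample point on the curve (hypothetical helper, illustration only)."""
    flags = []
    for (cx, cy) in centres:
        near = any(
            math.hypot(cx - px, cy - py) <= band * width
            for (px, py) in curve_points
        )
        flags.append(near)
    return flags

# Assumed setup: a uniform 4x4 grid on the unit square, stored row by row.
n = 4
h = 1.0 / n
centres = [((i + 0.5) * h, (j + 0.5) * h) for j in range(n) for i in range(n)]

# Assumed curve: 21 sample points along the diagonal y = x.
curve = [(t / 20.0, t / 20.0) for t in range(21)]

# Cells flagged True would be candidates for refinement; as the curve moves,
# the flags would be recomputed and unflagged cells coarsened again.
flags = flag_cells(centres, h, curve, band=0.5)
```

In the real code I would expect this test to run per process on a distributed tree or block structure rather than on a flat list, which is part of what I am asking about.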
Things that I would like to include:
- Parallel (I have brief experience with MPI, so I will probably stick with it) -- perhaps combined with OpenCL (no Nvidia cards around, so no CUDA)
- Combination of Python and C++ (script driven UI in Python, execution in C++)
- Object-oriented, design pattern based (one part I really want to learn)
- Unit testing framework (I have used gtest and will probably stick with it, but I am not sure how detailed to make the unit tests; I have read conflicting advice on unit testing scientific code)
- Linux based -- don't care too much about portability at this stage
- Perhaps using Boost libraries
- Use HDF5 or VTK for saving results (I know VTK, but feel HDF5 is better suited)
- Profiled performance
Some questions I am trying to answer:
- This feels like a mammoth task, which is fine, but what is the general process for breaking it down? Do you start with basic infrastructure (MPI wrappers, matrix classes etc.), with high-level interaction (the main controller, the UI etc.), or somewhere completely different?
- Does the paradigm of Python + C++ fit well with launching MPI on a cluster?
- I haven't found any books that deal with application design in a scientific context -- is that because none exist, or because I'm not looking in the right place?
- I am well aware of the ideal 'get it running and then profile' approach to optimisation, but I assume that some of the very basic design decisions made at the start will influence performance. What are the major gotchas to be aware of in the high-level design of numerical code?
NB: I'm not sure if this question fits the Stack Exchange format -- if not, I will happily rephrase it.