Properly design a parallel, performance critical numerical code

Question

I am a mechanical engineer by training, and work within a research environment mostly extending a large existing numerical code base of 25+ year old C. I have recently decided I would like to learn how to design a serious piece of scientific software from scratch.

I have spoken to a number of academics in the CS department at the university and it seems to be a commonly held belief that the people most likely to be building large scale numerical applications are in mechanical/chemical/biology departments. Equally, most of the people writing these applications have little or no training in software design principles.

Like most engineers, I learn by doing, so I am about to set myself the following task: develop an adaptive mesh scheme that locally refines/coarsens based on the location of an arbitrarily moving curve. Across this grid, solve the heat equation (or some other PDE).
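To make the task concrete, here is a rough 1D sketch of the two ingredients, written in Python only for brevity. It assumes a uniform grid, an explicit (FTCS) time step with fixed end values, and a "curve" reduced to a single moving point. The names `flag_for_refinement` and `heat_step` are my own placeholders, not part of any existing library, and the real code would need a proper mesh data structure rather than a list of flags.

```python
import math


def flag_for_refinement(cell_centres, curve_x, radius):
    """Mark cells whose centre lies within `radius` of the curve position.

    Hypothetical helper: a real AMR scheme would refine/coarsen a tree or
    block structure instead of returning a flat list of booleans.
    """
    return [abs(x - curve_x) <= radius for x in cell_centres]


def heat_step(u, dx, dt, alpha):
    """One explicit (FTCS) step of u_t = alpha * u_xx with fixed end values."""
    r = alpha * dt / dx**2
    if r > 0.5:
        # Standard stability limit for the explicit scheme in 1D.
        raise ValueError("unstable: alpha*dt/dx^2 must be <= 0.5")
    new = u[:]  # copy so the boundary values are carried over unchanged
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


# Illustrative setup (all values are assumptions chosen for the example).
n = 11
dx = 1.0 / (n - 1)
x = [i * dx for i in range(n)]
u = [math.sin(math.pi * xi) for xi in x]
u[0] = u[-1] = 0.0  # Dirichlet boundaries

u_next = heat_step(u, dx, dt=0.004, alpha=1.0)
flags = flag_for_refinement(x, curve_x=0.5, radius=0.15)
```

The full problem would replace the flat list with a hierarchy of cells, move `curve_x` every step, and distribute the grid across MPI ranks, which is where the design questions below come in.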

Things that I would like to include:

Parallel (I have brief experience with MPI, so I will probably stick with this) -- perhaps combined with OpenCL (no Nvidia cards around, so no CUDA)
Combination of Python and C++ (script driven UI in Python, execution in C++)
Object-oriented, design pattern based (one part I really want to learn)
Unit testing framework (I have used gtest and will probably stick with this, but I am not sure how detailed to make the unit tests; I have read differing advice on unit testing scientific code)
Linux based -- don't care too much about portability at this stage
Perhaps using Boost libraries
Use HDF5 or VTK for saving results (I know VTK, but feel HDF5 is better suited)
Profiled performance

Some questions I am trying to answer:

This feels like a mammoth task, which is OK, but what is the general process for breaking it down? Do you start with basic infrastructure (MPI wrappers, matrix classes etc.), or do you start with high level interaction (the main controller, the UI etc.), or somewhere completely different?
Does the paradigm of Python + C++ fit well with launching MPI on a cluster?
I haven't found any books that deal with application design in a scientific context -- is that because they don't exist, or because I'm not looking in the right place?
I am well aware of the ideal 'get it running and then profile' approach to optimisation, but I assume that some of the very basic design decisions made at the start will influence performance. What are the major gotchas to be aware of in the high level design of numerical code?

NB: I'm not sure if this question fits with the stackexchange format -- if not, I will happily rephrase...

This is too broad and vague a question for SO, but *could* be a fit for http://programmers.stackexchange.com. I've flagged it, requesting a migration. Rule of thumb: coding and got stuck? Post here. Still at the whiteboard designing and want feedback? Ask on Programmers instead. — Martijn Pieters, Dec 21 '12 at 15:54
@MartijnPieters Cool, thanks. I guess I will need an account over there too then? — BrT, Dec 21 '12 at 15:55
Yes you will need an account, but the accounts can be linked together through the same signup procedure. — sean, Dec 21 '12 at 15:57
Yes, you'd need an account there too. If you use the same login then your accounts can be associated, and if the question is migrated it'll automatically be assigned to your account there. — Martijn Pieters, Dec 21 '12 at 15:57
This is going to sound very arrogant on my part, but I feel the need to say it. You are trying to build a flying hospital without having built a one-dormitory house; that will obviously cause a lot of doubts, and the end result will most likely be useless. Start small, acquire experience, grow. — mmgp, Dec 21 '12 at 16:00
@mmgp Agreed, and not arrogant. The question that I am actually trying to solve in my head is this: many numerical researchers face a problem like this at least once in their early career, but very very few are equipped with the skills needed to think about it. Equally, they don't always have the time to start small and build up. This is why much academic code is horrendously difficult to maintain and extend. I am hoping to use my experience of jumping in the deep end to produce a survival guide for the lab I work in... — BrT, Dec 21 '12 at 16:06
One main difference from typical civil engineering, which we cannot ignore, is that software development is a very subjective area. You cannot use calculus to establish foundation requirements and so on. Instead, it all depends on previous experience. — mmgp, Dec 21 '12 at 16:10
I asked the moderators on Programmers if they wanted me to migrate this, and they agreed "not in its current form." You're free to repost over there, but please check out their [FAQ](http://programmers.stackexchange.com/faq) first, and try to reduce the scope of your question a little. — Bill the Lizard, Dec 21 '12 at 16:35
