An assignment I've just completed required me to create a set of scripts that can configure arbitrary Ubuntu machines as nodes in an MPI computing cluster. That part is done and the nodes can communicate with one another properly. Now, however, I would like to demonstrate the efficiency of the cluster by throwing a parallel program at it. I'm just looking for a straight brute-force calculation that divides its work among the number of processes (= nodes) available: if one node takes 10 seconds to run the program, 4 nodes should take only about 2.5.
With that in mind I looked for prime calculation programs written in C. For any purists: the program is not actually part of my assignment, as the course I'm taking is purely systems management; I just need anything that will show that my cluster is working. I have some programming experience, but little in C and none with MPI. I've found quite a few sample programs, but none of them seem to actually run in parallel. They do distribute all the steps among my nodes, so if one node has a faster processor the overall time goes down, but adding additional nodes does nothing to speed up the calculation.
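From the tutorials I've skimmed, I gather that a genuinely parallel program should split its iterations across processes by rank and then combine the partial results with a reduction. The untested sketch below (my C is shaky, so the names and details are my own guesses) sums the first N natural numbers that way; compiled with mpicc and launched with mpiexec -n 4, each process should do roughly a quarter of the additions:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const long long N = 1000000000LL;  /* upper bound of the sum */

        /* Each rank adds every size-th term, so the work is split into
           'size' roughly equal shares instead of being repeated. */
        long long local = 0;
        for (long long i = 1 + rank; i <= N; i += size)
            local += i;

        long long total = 0;
        MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum(1..%lld) = %lld\n", N, total);

        MPI_Finalize();
        return 0;
    }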
Am I doing something wrong? Are the programs that I've found simply not parallel? Do I need to learn C programming for MPI to write my own program? Are there any other parallel MPI programs that I can use to demonstrate my cluster at work?
EDIT
Thanks to the answers below I've managed to get several MPI programs working, among them the sum of the first N natural numbers (not very useful, as it quickly runs into data type limits), the counting and generation of prime numbers, and a Monte Carlo calculation of Pi. Interestingly, only the prime number programs realise a (sometimes dramatic) performance gain with multiple nodes/processes.
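For reference, the prime counters that did scale all share the same shape: each rank tests a stride of the candidate range and a reduction gathers the per-rank counts. A minimal sketch along those lines (the bound and the names are mine, not lifted from any particular answer):

    #include <mpi.h>
    #include <stdio.h>

    /* Trial-division primality test: deliberately naive, so there is
       real work to divide among the processes. */
    static int is_prime(long n)
    {
        if (n < 2) return 0;
        for (long d = 2; d * d <= n; d++)
            if (n % d == 0) return 0;
        return 1;
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const long N = 2000000;  /* upper bound; adjust for run time */

        /* Each rank tests every size-th candidate, so adding
           processes shrinks each rank's share of the range. */
        long local_count = 0;
        for (long n = 2 + rank; n <= N; n += size)
            local_count += is_prime(n);

        long total = 0;
        MPI_Reduce(&local_count, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("%ld primes below %ld\n", total, N);

        MPI_Finalize();
        return 0;
    }

The naive trial division is exactly what makes the speedup visible: the work per candidate dwarfs the communication, so doubling the process count roughly halves the run time.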
The issue that caused most of my initial problems in getting programs to run was rather obscure and apparently due to issues with the hosts files on the nodes. Running mpiexec with the -disable-hostname-propagation parameter solved this problem, which may manifest itself in a variety of ways: MPI barrier errors, TCP connect errors and other generic connection failures. I believe it may be necessary for all nodes in the cluster to know one another by hostname, which is not really an issue in classic Beowulf clusters that have DHCP/DNS running on the server node.
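For example, with MPICH's Hydra launcher (the machinefile name and the hostnames/addresses below are placeholders for your own):

    mpiexec -disable-hostname-propagation -f machinefile -n 4 ./primes

where machinefile lists one node hostname per line, and every node's /etc/hosts can resolve all the others, e.g.:

    192.168.1.10  node0
    192.168.1.11  node1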