14

I am trying to implement an OpenMP version of the 2-dimensional n-body simulation.

But there is a problem: I assume each particle's initial velocity and acceleration are zero. When the particles first gather together, they would disperse out in a high speed, and do not gather again.

This doesn't seem to be consistent with the Newton Law, right?

Can someone explain why that happens and how I can fix the error?

Here is part of my code:

/* update one frame */
void update() {
    int i, j;

    omp_set_num_threads(8);
    # pragma omp parallel private(j)
    {

    # pragma omp for schedule(static)
        for ( i = 0; i < g_N; ++i ) {
            g_pv[i].a_x = 0.0;
            g_pv[i].a_y = 0.0;
            for ( j = 0; j < g_N; ++j ) {
                if (i == j)
                    continue;
                double r_2 = pow((g_pv[i].pos_x - g_pv[j].pos_x),2) + pow((g_pv[i].pos_y - g_pv[j].pos_y),2);
                g_pv[i].a_x += (-1) * G * g_pv[j].m * (g_pv[i].pos_x - g_pv[j].pos_x) / (pow(r_2 + e,1.5));
                g_pv[i].a_y += (-1) * G * g_pv[j].m * (g_pv[i].pos_y - g_pv[j].pos_y) / (pow(r_2 + e,1.5));             
            } 
            g_pv[i].v_x += period * g_pv[i].a_x;
            g_pv[i].v_y += period * g_pv[i].a_y;
        }
    # pragma omp for schedule(static)
        for ( int i = 0; i < g_N; ++i ) {
            g_pv[i].pos_x += g_pv[i].v_x * period;
            g_pv[i].pos_y += g_pv[i].v_y * period;
        }
    }
}

Don't worry about the OpenMP, just treat it as an sequential version. OpenMP would not affect the outcome much.

Edit: to clarify, here is the entire code (there might be some errors in this part, but the problem I described should occur in the above code section)

# include <iostream>
# include <fstream>
# include <iomanip>
# include <cmath>
# include <vector>
# include <cstdlib>
# include <omp.h>

# include <GL/glew.h>
# include <GL/freeglut.h>
# include <GL/gl.h>

using namespace std;

/* the size of the opengl window */
# define WIDTH 2000
# define HEIGHT 2000

/* define the global constants */
const double G = 6.67 * pow(10, -11);
// const double G = 6.67;
const double e = 0.00001;
const double period = 1;

/* define the structure of particle */
struct particle
{
    double m;
    double pos_x;
    double pos_y;
    double v_x;
    double v_y;
    double a_x;
    double a_y;

    particle(double m = 0, double pos_x = 0, double pos_y = 0, 
            double v_x = 0, double v_y = 0, double a_x = 0, double a_y = 0)
    {
        this->m         = m;
        this->pos_x     = pos_x;
        this->pos_y     = pos_y;
        this->v_x       = v_x;
        this->v_y       = v_y;
        this->a_x       = a_x;
        this->a_y       = a_y;
    }
};

/* define the global data */
int g_N;                        // number of particles
vector<particle> g_pv;          // particle vector

void setUp();
void update();
void display();

int main(int argc, char ** argv) {

    /* set up the window */
    glutInit(&argc, argv);
    glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
    glutInitWindowSize (WIDTH, HEIGHT);
    glutInitWindowPosition (100, 100);
    glutCreateWindow ("openmp");

    /* initialize */
    setUp();

    glutDisplayFunc(display);
    glutMainLoop();

    return 0;
}

/* read the input data */
void setUp() {
    glMatrixMode (GL_PROJECTION);
    glLoadIdentity ();
    /* Sets a 2d projection matrix
     * (0,0) is the lower left corner (WIDTH, HEIGHT) is the upper right */
    glOrtho (0, WIDTH, 0, HEIGHT, 0, 1);
    glDisable(GL_DEPTH_TEST);

    ifstream inFile;
    inFile.open("input_25.txt");

    inFile >> g_N;
    g_pv.resize(g_N);
    for ( int i = 0; i < g_N; ++i )
    {
        inFile >> g_pv[i].m >> g_pv[i].pos_x >> g_pv[i].pos_y
               >> g_pv[i].v_x >> g_pv[i].v_y >> g_pv[i].a_x >> g_pv[i].a_y; 
    }
    inFile.close();
}

/* display in openGL */
void display(void) {
    glClear(GL_COLOR_BUFFER_BIT);
    for(int i = 0; i < g_pv.size(); ++i) {
        /* Get the ith particle */
        particle p = g_pv[i];

        /* Draw the particle as a little square. */
        glBegin(GL_QUADS);
        glColor3f (1.0, 1.0, 1.0);
        glVertex2f(p.pos_x + 2, p.pos_y + 2);
        glVertex2f(p.pos_x - 2, p.pos_y + 2);
        glVertex2f(p.pos_x - 2, p.pos_y - 2);
        glVertex2f(p.pos_x + 2, p.pos_y - 2);
        glEnd();
    }   
    update();
    glutPostRedisplay();
    glFlush ();
}

/* update one frame */
void update() {
    int i, j;

    omp_set_num_threads(8);
    # pragma omp parallel private(j)
    {
        /* compute the force */
        # pragma omp for schedule(static)
            for ( i = 0; i < g_N; ++i ) {
                g_pv[i].a_x = 0.0;
                g_pv[i].a_y = 0.0;
                for ( j = 0; j < g_N; ++j ) {
                    if (i == j)
                        continue;
                    double r_2 = pow((g_pv[i].pos_x - g_pv[j].pos_x),2) + pow((g_pv[i].pos_y - g_pv[j].pos_y),2);
                    g_pv[i].a_x += (-1) * G * g_pv[j].m * (g_pv[i].pos_x - g_pv[j].pos_x) / (pow(r_2 + e,1.5));
                    g_pv[i].a_y += (-1) * G * g_pv[j].m * (g_pv[i].pos_y - g_pv[j].pos_y) / (pow(r_2 + e,1.5));             
                } 
                g_pv[i].v_x += period * g_pv[i].a_x;
                g_pv[i].v_y += period * g_pv[i].a_y;
            }

        /* compute the velocity */
        # pragma omp for schedule(static)
            for ( int i = 0; i < g_N; ++i ) {
                g_pv[i].pos_x += g_pv[i].v_x * period;
                g_pv[i].pos_y += g_pv[i].v_y * period;
            }
    }
}
Z boson
  • 32,619
  • 11
  • 123
  • 226
user3558391
  • 161
  • 2
  • 7
  • 2
    Your `j` loop appears to be accessing array elements that have not yet been initialized by your `i` loop. Maybe run the entire `i` loop to initialize before doing the double loop with `i` and `j`? – Galik Dec 15 '14 at 08:55
  • Where do you include the mass of `g_pv[i].m`? After you loop over `j` shouldn't you do `g_pv[i].a_x *= g_pv[i].m` and `g_pv[i].a_y *= g_pv[i].m`? – Z boson Dec 15 '14 at 12:21
  • 4
    Decrease the complexity of your code and it will be less error prone. – Daniel Daranas Dec 15 '14 at 14:22
  • @Zboson: the mass is not there because this is *acceleration*, not force. – Rufflewind Dec 15 '14 at 21:07
  • 6
    @user3558391: I suspect the problem here is at least partly due to the inaccuracy of the integration method that you're using (first-order [Euler method](https://en.wikipedia.org/wiki/Euler_method)). Have you considered using [Verlet integration](https://en.wikipedia.org/wiki/Verlet_integration) instead? When the particles collapse onto each other, it reaches a singularity in the potential so there's a possibility for large numerical errors to occur there. – Rufflewind Dec 15 '14 at 21:20
  • When I placed a particle in a stable circular orbit the results seem to be pretty close to what I would expect (about as good as I would expect for a first-order Euler method), so I suspect this is mostly due to numerical errors. – Rufflewind Dec 15 '14 at 21:30
  • @Rufflewind, yeah, I just realized that: `a=F/m`. – Z boson Dec 16 '14 at 09:01
  • The point of using OpenMP is for optimization. If you're having problems with a single threaded code then you should fix that first. Additionally, you can optimize your single thread code as well. There is no reason to calculate the force twice as `F_i,j = -F_j,i`. This will cut down your iterations by a factor of 2 (from `n^2` iterations to `n(n-1)/2`). Get your single thread code working correctly first, then optimize it, then optimize with OpenMP. – Z boson Dec 16 '14 at 09:53
  • @Z boson. If delete all the #pragma, it would become sequential version. I've tried the single thread version. It didn't work either. – user3558391 Dec 16 '14 at 21:37
  • 3
    I know, that's my point. Before worrying about OpenMP get the sequential version working correctly. – Z boson Dec 17 '14 at 08:46
  • 2
    I made the following input file: `2 100000000 500 500 0 0 0 0 1 530 500 0 0.005 0 0` It behaves exactly as I would expect it to with one planet orbiting another. Now if `0.005` is replaced by `0`, the lighter planet will collide with the more massive one. This is to say that the force will eventually become almost infinite, leading to blowing up (see e.g. http://www.gromacs.org/Documentation/Terminology/Blowing_Up). Add repulsion at short ranges (say Lennard-Jones type), and things will probably look better. – alarge Dec 17 '14 at 14:26
  • @alarge, if you have a good answer please write one up. I don't have time to look into this due to end of the year deadlines but it's an interesting question. – Z boson Dec 17 '14 at 14:43
  • 1
    *I assume each particle's initial velocity and acceleration are zero.* You are doing something very, very bad by making all your particles have an initial velocity of zero. Your system has zero angular momentum and this makes your system singular. The solution to your problem is the same solution the doctor gives to the patient who tells the doctor, *Doctor, it hurts when I do this! «bonk»*: Don't do that then. – David Hammen Dec 18 '14 at 22:35
  • @DavidHammen, yes but even if the we used only 2 point like bodies one with much a larger mass than the other and we gave the small mass a small tangential velocity but still zero radial velocity then the orbit would avoid the singularity but the OPs method would still give bad results I think. – Z boson Dec 19 '14 at 12:55

2 Answers2

8

I'm expanding my comment to an answer (as suggested by Z boson) with several suggestions as to how you might address the problem.

This question really belongs to the Computational Science.SE, as I don't think there is anything wrong with the code per se, but rather the algorithm seems faulty: As the particles get close you can get a force of G / pow(e,1.5) ~ G * 1e7. This is large. Very, very large (compared to your time step). Why? Suppose you have two planets, one massive sitting at (0, 0) and a small one at (0, 1), where the latter gets a very large acceleration towards the former. On the next step the small planet will be at (0, -100), or whatever, and the force on it zero, which means that it will never return and now has a considerable velocity, too. Your simulation has blown up.

As you said, this does not jive well with Newton's laws, so this indicates that your numerical scheme has failed. Worry not, there are several ways to fix this. You already anticipated as much, as you added e. Just make it larger, say 10, and you ought to be fine. Alternatively, set a very small timestep, period. You could also just not compute the force if the particles get too close or have some heuristic as to what should happen if planets collide (maybe the explode and vanish, or throw an exception). Or have a repulsive force say with r - 2 potential or something: g_pv[i].a_y += (-1) * G * g_pv[j].m * (g_pv[i].pos_y - g_pv[j].pos_y) * (1 / pow(r_2 + e,1.5) - 1 / pow(r_2 + e,2.5)); This is similar to how the phenomenological Lennard-Jones interaction incorporates the repulsion arising from the Pauli exclusion principle. Note that you can increase the sharpness of the repulsion (change 2.5 to 12.5, if you like) which means that the repulsion has less of an effect far away (good), but that it needs to be resolved more accurately, leading to a smaller period (bad). Ideally you would start with an initial configuration that does not lead to collisions, but this is nigh-impossible to predict. In the end, you'll probably want to use a combination of the methods that were listed above.

Using OpenMP might lead to a slight speed-up, but you should really be using algorithms for long-range forces, such as Barnes-Hut simulation. For more, see for example the 2013 Summer School on Fast Methods for Long Range Interactions in Complex Particle Systems and the booklet (available for free) they published on the most recent developments. You also might not want to display every time step: Scientific simulations usually save every 5000th step or so. If you want nice looking movies, you can interpolate with some moving averages to smooth out the noise (your simulation does not have temperature or anything of the sort, so you will probably be fine without averaging). Also I'm not sure if your data structures are optimized for the job or if you might be running into cache misses. I'm not really an expert, so maybe someone else might want to weigh in on this. Finally, consider not using pow, but rather the fast inverse square root or similar methods.

Community
  • 1
  • 1
alarge
  • 2,162
  • 2
  • 13
  • 14
  • I agree there is no problem in the OP's implementation of the method used but what the OP expects from the method. But I think the OP's question is within the scope of the physics tag (and OpenMP is a major red herring). – Z boson Dec 18 '14 at 11:46
  • You could explain why "this does not jive well with Newton's laws" (e.g. conservation of energy is violated and why). Your thought experiment with a large and small mass is good. You could also give the small mass a tiny tangential velocity then it would avoid the singularity. You can show analytically that it will never obtain escape velocity but with the OP's method it could obtain escape velocity violating conservation of energy. – Z boson Dec 18 '14 at 11:55
  • It seems I'm wrong about the escape velocity. It could have a [hyperbolic trajectory](https://en.wikipedia.org/wiki/Hyperbolic_trajectory) (it's been a long time since I thought about such things). But anyway, I think you could could construct it so that it had an orbit and after the OPs method it would escape the orbit when it should not. – Z boson Dec 18 '14 at 14:46
  • @Zboson There is actually no singularity in the OP's equations per se (thanks to the use of a small `e` to avoid the division by zero). The equations can, however, get "close enough" to a singularity that things can blow up with the used timestep (`period`) for reasons I discussed in the "thought experiment". This is to say that I don't think the issues have anything to do with numerical floating point precision, but rather the accuracy of the algorithm, if you will. This is why I did not explicitly mention the former in the answer. – alarge Dec 18 '14 at 15:05
  • I agree the issues have little to do with floating point precision. That's why I added the numerical-methods tag and not the floating point tag – Z boson Dec 18 '14 at 16:50
  • The reason I suggested using a tiny tangential velocity was to get away from using a `e` fudge factor. Using a tangential velocity avoids the singularity by making the trajectory 2D instead of 1D. But now I realized that a tiny tangential velocity probably means the trajectory is hyperbolic and not an orbit. – Z boson Dec 18 '14 at 16:52
  • I don't disagree with anything in your answer I was just trying to push your to be more precise with "this does not jive well with Newton's laws". There are probably several ways to show that. – Z boson Dec 18 '14 at 16:54
  • Actually, I think a tiny tangential velocity and zero radial velocity (initial condition) will produce an orbit. Hyperbolic probably requires some radial velocity. – Z boson Dec 18 '14 at 17:13
  • 1
    @Zboson The thing is that especially when there are several planets, you can't assume much about the stability of orbits and so on. Yes, for the 2 planet case you are relatively safe dropping the "fudge factor" if you make sure to have enough tangential velocity so as the planets will not get too close. Now with 25 planets (say), there are no guarantees and you'll have to address this somehow (one way to do this would be to use `e`). – alarge Dec 19 '14 at 12:38
  • I agree, I was referring to your your two body example. It should be possible to show that the Euler method fails to get even the two body problem right with certain conditions. If it get's 2 wrong it will get more than 2 wrong as well. – Z boson Dec 19 '14 at 12:47
  • 2
    @Zboson - The code in the opening post is using a symplectic integrator. Basic Euler uses x(t+Δt)=x(t)+v(t)Δt, v(t+Δt)=x(t)+a(x(t))Δt. [Symplectic Euler](http://en.wikipedia.org/wiki/Semi-implicit_Euler_method) uses v(t+Δt)=x(t)+a(x(t))Δt, x(t+Δt)=x(t)+v(t+Δt)Δt. The OP uses the latter. The problem isn't lack of symplecticity; it's the low order nature of the technique. – David Hammen Dec 19 '14 at 13:15
4

The first evident problem is that you are using a simple "Taylor integration scheme". The point is that since you are approximating the infinitely small dt, to a finite time difference Δt that is your period, you are expanding the well known motion equations in a Taylor expansion, with infinite terms.... that you are truncating.

In general:

xt+Δt = xt + xt Δt + 1/2 xt Δt2 + 1/6 x′″t Δt3 + O(Δt4)

being xt the first derivative at time t, the velocity v; xt the second derivative, the acceleration a,...; O(Δt4) is the order of the error - in this example, we are truncating the expansion at the 3rd order, and we have a local error of the 4th.

In your case you are using (Euler method):

xt+Δt = xt + vt Δt + O(Δt2)

and

vt+Δt = vt + at Δt + O(Δt2)

In this case, since you are stopping the expansion to the first term, this is a first order approximation. you are going to get a local error of order O(Δt2), that corresponds to a global error of order O(Δt). That's how the truncation error behaves - see this reference for more details.

I know very well of two different integration scheme that improve the Euler method:

and I know of Runge-Kutta methods (find the references from the two cited articles) that are even of higher order, but I have never used them personally.

The order I'm speaking of here is the order of the truncation of the approximation, strictly related to the order of the error performed with the truncation itself.

The Leapfrog method is of second order.

The Verlet method is of third order, with local error O(Δt4). This is a third order method even if you don't see any third order derivative because they cancel out in the derivation, as shown in the wikipedia reference.

An higher order integration scheme let you obtain more precise trajectories without shrinking the time-step, Δt.

However the most important properties these integration methods have, missing on the simple Taylor forward integration, whatever is the truncation order, are:

  • reversibility
  • they are Symplectic, i.e., they do conserve energy

From the references you will find a lot of material to learn it, but a good N-body book would let you learn it in a more structured way.

Just a final note about the very important difference between the two schemes: Verlet method does not give automatically the velocities, that need to evaluated (straightforwardly) as a subsequent step. On the other hand Leapfrog method does evaluate both positions and velocities, but at different times - with this scheme you cannot evaluate both quantities at the same time.

Once you have a good integration scheme you need to worry about what's the maximum error you are able to tolerate, some error analysis is needed at this point, and then you will need to implement multiple-time-scale methods to have accurate integration even in case two objects are too close even for O(Δt4) or larger (thinking to gravitational slings).

They can be global (that is, reduce period everywhere) or local (do it just for some partition of the system, where the particles are too close)... but at this point you should be able to go and find more references by yourself.

Sigi
  • 4,826
  • 1
  • 19
  • 23
  • 1
    I'm glad you mentioned that the OP's method (Euler method) does not conserve energy. – Z boson Dec 18 '14 at 11:52
  • 1
    It is also worth mentioning that Runge-Kutta is not symplectic, and as such is not usually suitable for this type of simulations. For example page 18 of http://arxiv.org/abs/1412.5187 has nice illustrations of how each method performs (as do most introductory books to the subject). – alarge Dec 18 '14 at 14:53
  • 1
    There are symplectic Runge Kutta methods (e.g., symplectic partitioned Runge Kutta). Moreover, whether symplecticity is important depends on the time span of interest. If the time span is less than a million years or so (and if the system is collision-free), a high order, high-precision technique may be "better" than any symplectic technique. Symplectic techniques by themselves won't help if the system is not collision-free. A lot of care and feeding is needed in the case of collisional systems. Setting the initial velocities to zero guarantees that the system is not collision-free. – David Hammen Dec 18 '14 at 22:58
  • I gave @alarge the bounty. Your and his answer are both equally good I think (the voters seem to think so also). He answered first. If I could split the bounty I would but I can't. Part of the OP's question was "This doesn't seem to be consistent with the Newton Law, right?". I would like to have seen a numerical example of why his chosen method violates Newtonian physics (rather than just saying that it does) but at least you mentioned that conservation of energy is violated. – Z boson Dec 19 '14 at 12:03
  • I would have answered even without the bounty, I don't care of it actually: I liked the question. I think that @alarge concentrated too much on the numeric aspects and then introduced "molecular astrophysics" (I'm a computational chemist as well... but I wouldn't introduce LJ to deal with planet collisions), so I don't like his answer, but that's a personal opinion. On the other side my answer was not accurate until I edited it, correcting details that I had forgotten. Missing: some good *astrophysics* N-body references would be the best - but being on the "micro" side I cannot help here. – Sigi Dec 19 '14 at 12:28
  • @Sigismondo What to suggest, to me, depends on the application. I'm assuming that here we're not really interested in the true physics of the system, as if we were, planetary collisions would have to be dealt in a more realistic way, anyway. I agree that adding repulsion or other heuristics for what happens when planets get near to one another is not ideal, but if for example we have a game where one is supposed to survive the attack of planets, the approach makes sense: More accurate algorithms would help but are costly and fundamentally still suffer from the same problems. – alarge Dec 19 '14 at 12:35
  • It occurred to me that my previous comment might not have been very clear. What I am saying is that I absolutely agree with you in that Verlet/velocity Verlet/leapfrog (or possibly something more exotic) should be used and this is indeed something that I should have mentioned in my answer. However, these approaches do not directly address the underlying problem of blowing up and just stating that these integrators should be used does not explain why it happens, either. Anyway, I don't disagree with anything you said. – alarge Dec 19 '14 at 12:58
  • Verlet and Leapfrog give the same (space) trajectories, if correctly initialized. They differ in the velocity computations since Leapfrog computes them at the midpoints of the sampling intervals whereas (velocity) Verlet computes them at the sampling points. Thus both are of order 2, the reference with order 3 is wrong. – Lutz Lehmann Jan 27 '15 at 22:58
  • @lutzl That's something I remembered from a course and found in the reference... I'll double check it and correct the post, thanks. However in molecular dynamics the 2 methods, or even the same method on different architectures, can't give the same trajectories because of the lyapunov instability, letting small errors propagate. Are gravitational systems inrinsecally different in this respect, or does it depend on the number of interacting bodies to your knowledge? – Sigi Jan 29 '15 at 23:16
  • One can eliminate the velocities and then the resulting order 2 recursions are identical, they amount to replacing the second derivative in `x''=a(x)` with the simple central second order difference quotient. Numerical errors may in the end contribute, but before that any error stems from the initialization, the usual formulas make both methods identical, errors there lead to order 1 differences. – Lutz Lehmann Jan 30 '15 at 00:27