In the program I'm working on, I have 3-element arrays which I use as mathematical vectors for all intents and purposes.

In the course of writing my code, I was tempted to just roll my own Vector class with simple arithmetic overloads (+, -, *, /) so I can simplify statements like:

// old:
for (int i = 0; i < 3; i++)
    r[i] = r1[i] - r2[i];

// new:
r = r1 - r2;

This should be more or less identical in the generated code. But when it comes to more complicated things, could this really impact my performance heavily? One example that I have in my code is this:

Manually written version:

for (int j = 0; j < 3; j++)
{
    p.vel[j] = p.oldVel[j] + (p.oldAcc[j] + p.acc[j]) * dt2 + (p.oldJerk[j] - p.jerk[j]) * dt12;
    p.pos[j] = p.oldPos[j] + (p.oldVel[j] + p.vel[j]) * dt2 + (p.oldAcc[j] - p.acc[j]) * dt12;
}

Using the Vector class with operator overloads:

p.vel = p.oldVel + (p.oldAcc + p.acc) * dt2 + (p.oldJerk - p.jerk) * dt12;
p.pos = p.oldPos + (p.oldVel + p.vel) * dt2 + (p.oldAcc - p.acc) * dt12;

I am attempting to optimize my code for speed, since this sort of code runs inside of inner loops. Will using the overloaded operators for these things affect performance? I'm doing some numerical integration of a system of n mutually gravitating bodies. These vector operations are extremely common so having this run fast is important.

Any insight would be appreciated, as would any idioms or tricks I'm unaware of.

Mike Bailey
  • You might want to look into expression templates. But keep in mind that to get accurate answers you'll need to profile any solutions you try. – GManNickG Apr 22 '10 at 05:47
  • Also, some compilers not only inline for you, they also unroll short loops that have a fixed number of iterations (like 3). You could look at the disassembly to see whether that's true, or you could do what others have suggested and run some benchmarks to see whether overloading operators is much faster. – Xavier Ho Apr 22 '10 at 06:12
  • Xavier, I compiled the operator overloaded and the manually written versions and generated the assembly + source .asm files. Most of the loops were extremely similar. Some, like the one I posted above, had very long chains of assembly for the operator overload based one, and much shorter chains with my manual loops. Even so, as I commented below, my Vector version was still faster. – Mike Bailey Apr 22 '10 at 06:33
  • You should tag your question with C++ tag. – Vicente Botet Escriba Apr 25 '10 at 12:42

3 Answers

If the operations are inlined and optimised well by your compiler, you shouldn't usually see any difference between writing the code well (using operators to make it readable and maintainable) and manually inlining everything.

Manual inlining also considerably increases the risk of bugs, because instead of re-using a single piece of well-tested code you'll be writing the same code over and over. I would recommend writing the code with operators and then, if you can prove you can speed it up by manually inlining, duplicating the code and manually inlining the second version. Then you can run the two variants against each other to prove (a) that the manual inlining is effective, and (b) that the readable and manually-inlined code both produce the same result.

Before you start manually inlining, though, there's an easy way for you to answer your question for yourself: Write a few simple test cases both ways, then execute a few million iterations and see which approach executes faster. This will teach you a lot about what's going on and give you a definite answer for your particular implementation and compiler that you will never get from the theoretical answers you'll receive here.
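
A rough sketch of such a test, assuming a hypothetical `Vec3` struct in place of your Vector class and made-up input values (`StepOps`, `StepLoop`, `time_ms`, `nearly_equal` and `compare` are illustrative names, and `dt2`/`dt12` are given placeholder values). It times the velocity update from the question both ways and checks that the two variants agree:

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical stand-in for the asker's Vector class (not shown in the question).
struct Vec3 {
    double v[3];
};

inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return Vec3{{a.v[0] + b.v[0], a.v[1] + b.v[1], a.v[2] + b.v[2]}};
}
inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    return Vec3{{a.v[0] - b.v[0], a.v[1] - b.v[1], a.v[2] - b.v[2]}};
}
inline Vec3 operator*(const Vec3& a, double s) {
    return Vec3{{a.v[0] * s, a.v[1] * s, a.v[2] * s}};
}

// Velocity update from the question, written with the overloaded operators.
struct StepOps {
    Vec3 operator()(const Vec3& oldVel, const Vec3& oldAcc, const Vec3& acc,
                    const Vec3& oldJerk, const Vec3& jerk,
                    double dt2, double dt12) const {
        return oldVel + (oldAcc + acc) * dt2 + (oldJerk - jerk) * dt12;
    }
};

// The same update written as a manual component loop.
struct StepLoop {
    Vec3 operator()(const Vec3& oldVel, const Vec3& oldAcc, const Vec3& acc,
                    const Vec3& oldJerk, const Vec3& jerk,
                    double dt2, double dt12) const {
        Vec3 r;
        for (int j = 0; j < 3; j++)
            r.v[j] = oldVel.v[j] + (oldAcc.v[j] + acc.v[j]) * dt2
                   + (oldJerk.v[j] - jerk.v[j]) * dt12;
        return r;
    }
};

// Runs `step` repeatedly, feeding each result into the next call so the
// loop cannot be discarded, and returns the elapsed time in milliseconds.
template <typename Step>
double time_ms(Step step, long iterations, Vec3& result) {
    const Vec3 oldAcc{{1.0, 2.0, 3.0}}, acc{{1.5, 2.5, 3.5}};
    const Vec3 oldJerk{{0.1, 0.2, 0.3}}, jerk{{0.3, 0.2, 0.1}};
    const double dt2 = 1e-3, dt12 = 1e-6;  // placeholder values
    Vec3 vel{{0.0, 0.0, 0.0}};
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i)
        vel = step(vel, oldAcc, acc, oldJerk, jerk, dt2, dt12);
    const auto stop = std::chrono::steady_clock::now();
    result = vel;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Compares with a relative tolerance, since the compiler may round the
// two variants slightly differently.
inline bool nearly_equal(const Vec3& a, const Vec3& b, double tol = 1e-9) {
    for (int j = 0; j < 3; j++)
        if (std::fabs(a.v[j] - b.v[j]) > tol * (1.0 + std::fabs(a.v[j])))
            return false;
    return true;
}

// Times both variants and asserts that they produce the same result.
inline void compare(long iterations, double& opsMs, double& loopMs) {
    Vec3 a, b;
    opsMs = time_ms(StepOps{}, iterations, a);
    loopMs = time_ms(StepLoop{}, iterations, b);
    assert(nearly_equal(a, b));
}
```

Call `compare` from `main` with a few million iterations and print the two timings. Build with the same optimisation flags as your real program and repeat the run a few times, since a single measurement can be noisy.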

Jason Williams
  • Thanks, I went ahead and littered my code with some #ifdefs to test the manually written vs operator versions. The Vector based code was actually slightly faster (a few hundred milliseconds faster). Not much compared to the several-minute run time, but it's certainly something sizable. – Mike Bailey Apr 22 '10 at 06:30

I would like to look at it the other way around; starting with the Vector class, and if you get performance problems with that you can see if manually inlining the calculations is faster.

Aside from the performance, you also mention that the calculations have to be accurate. Having the vector-specific calculations in a class means that it's easier to test them individually, and also that the code using the class gets shorter and easier to maintain.

Guffa

Check out the ConCRT code samples:

http://code.msdn.microsoft.com/concrtextras/Release/ProjectReleases.aspx?ReleaseId=4270

There are a couple of samples (including an NBody sample) that do a bunch of tricks like this with Vector types, templates, etc.

Ade Miller