
I am about to write an off-lattice diffusion limited aggregation (DLA) simulation and I am wondering whether to use C or C++.

C++ would be nice for design reasons but I am wondering if C would perform better. Of course I know about algorithm performance and have chosen the best possible algorithm. So I'm not talking about improving O(n^2) to O(log n) or similar. I'm trying to reduce my constants so to speak.

If you do not know DLA, it basically boils down to having an array of doubles (size between 10^3 and 10^6) and, in a loop, choosing random doubles to compare (greater/less than) against large portions of the array.
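To make that access pattern concrete, here is a minimal sketch of it. The names `count_less` and `run_scans`, the [0, 1) probe range and the seed parameter are assumptions made for illustration only; a real off-lattice DLA step would do geometric distance tests rather than this bare comparison.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: counts how many stored values are smaller than a probe.
// This is a linear scan over contiguous doubles.
std::size_t count_less(const std::vector<double>& values, double probe)
{
    std::size_t n = 0;
    for (double v : values) {
        if (v < probe) {
            ++n;
        }
    }
    return n;
}

// Hypothetical driver: draws `steps` random probes in [0, 1) and scans the
// whole array once per probe, accumulating the counts.
std::size_t run_scans(const std::vector<double>& values, int steps, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::size_t total = 0;
    for (int i = 0; i < steps; ++i) {
        total += count_less(values, dist(gen));
    }
    return total;
}
```

The inner loop is where essentially all the time goes, so it is the part whose generated assembly is worth inspecting.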

So the performance differences that would matter here are in data access and function calls:

  • Data access: C struct vs. C++ class with public data members vs. C++ class with private data members and accessors.
  • Calling functions: C functions vs. C++ member functions.

Am I right to conclude that the ultimate way to judge this is to look at the assembly code (e.g. comparing the number of moves/loads, jumps and calls)? This is of course compiler dependent (e.g. you could compare an awful C compiler to a good C++ compiler). I'm using the GNU compilers (gcc and g++).

I've found that the assembly produced by gcc and g++ is almost identical in terms of the number of jumps (none), moves/loads and calls for the following two programs:

C program

#include <stdlib.h>

typedef struct 
{
    double x;
} particle;

double square(double a)
{
    return a*a;
}

int main()
{

    particle* particles = malloc(10*sizeof(particle));
    double res;

    particles[0].x = 60.42;

    res = square(particles[0].x);

    return 0;
}

C++ program

class particle
{
    public:
        double x;

    public:
        double square()
        {
            return x*x;
        }

};

int main()
{

    particle* particles = new particle[10];
    double res;

    particles[0].x = 60.42;

    res = particles[0].square();

    return 0;
}

If I use private member data in the C++ program I of course get another call in the assembly when I call particles[0].setx(60.42).

Does this mean I might as well choose C++ over C, since they produce almost the same assembly code? Should I avoid private member data since it adds extra function calls (i.e. is a call in assembly expensive)?

Cœur
Wuhtzu
  • C++ is just as performant as C, and can be more so. How are you calling g++? I'm very surprised it didn't inline the `square()` method if optimizations are on. – Collin May 25 '13 at 16:58
  • There's no point making a `setx` method if the only thing it does is set x to whatever is passed in; you might as well make the data public. Also, using C++ doesn't mean you can't use structs with just members. – Barış Uşaklı May 25 '13 at 16:58
  • If your algorithm depends on large arrays of numbers, you might want to avoid using classes with member variables as in your example, because of how a class is laid out in memory. You might see better performance with large contiguous allocations of doubles instead of contiguous allocations of classes with mixed member data. – PureW May 25 '13 at 17:05
  • Generate assembly language listings of the two programs. You'll be amazed at how few and how subtle the differences are. – Thomas Matthews May 25 '13 at 17:18
  • If you allocate variables dynamically once, the time spent in allocation is not a performance issue. Your performance issue will be in processing the data and presenting or applying the data. – Thomas Matthews May 25 '13 at 17:20
  • @ThomasMatthews The allocation should not be a problem. I'm thinking more along the lines of making sure that the data, his `x`, lies in a large contiguous chunk, thus keeping as much data as possible in the CPU's cache. – PureW May 25 '13 at 18:05
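
The layout point raised in the comments can be sketched as follows. This is only an illustration: the extra members `radius` and `id`, and all type and function names here, are hypothetical and not taken from the question. The first layout interleaves `x` with other members; the second keeps every `x` adjacent in memory, so a scan over `x` touches fewer cache lines.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: consecutive x values are separated by the other members.
struct particle_aos {
    double x;
    double radius;  // hypothetical extra member
    int    id;      // hypothetical extra member
};

// Struct-of-arrays: all x values are stored back to back.
struct particles_soa {
    std::vector<double> x;
    std::vector<double> radius;
    std::vector<int>    id;
};

// Scan over the interleaved layout.
std::size_t count_less_aos(const std::vector<particle_aos>& ps, double probe)
{
    std::size_t n = 0;
    for (const particle_aos& p : ps) {
        if (p.x < probe) ++n;
    }
    return n;
}

// Scan over the contiguous x array only.
std::size_t count_less_soa(const particles_soa& ps, double probe)
{
    std::size_t n = 0;
    for (double x : ps.x) {
        if (x < probe) ++n;
    }
    return n;
}
```

Both functions compute the same result; whether the second is measurably faster depends on the member mix, the array size and the hardware, so it needs to be profiled rather than assumed.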

2 Answers


Given the types of things you're outlining, I'd be surprised to see a significant advantage to C for any of it. Based on what you've said, I'd also guess the comparison(s) you've done are based on compiling with little or no optimization enabled. With full optimization enabled, I'd expect even those to disappear.
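
As a rough illustration of why the accessor call tends to disappear, here is a sketch of the question's `particle` with private data. The member name `x_`, the getter `x()` and the two helper functions are assumptions added for the comparison. Member functions defined inside the class body are implicitly inline, and with optimization enabled (for example `g++ -O2 -S`) compilers typically emit no `call` for them, though that is worth confirming in the generated assembly.

```cpp
class particle
{
public:
    // Defined in the class body, so these are implicitly inline.
    double x() const { return x_; }
    void setx(double value) { x_ = value; }
    double square() const { return x_ * x_; }

private:
    double x_ = 0.0;
};

// Hypothetical helper: goes through the private member and its accessors.
double square_via_accessors(double value)
{
    particle p;
    p.setx(value);
    return p.square();
}

// Hypothetical helper: the plain C-style equivalent on a bare double.
double square_plain(double value)
{
    return value * value;
}
```

Compiling both helpers with and without `-O2` and comparing the listings is a direct way to check whether the `setx` call survives.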

In the long term, C++ offers more chances for optimization. One that's fairly common with matrix arithmetic (though I'm not sure it's applicable to your DLA simulation) is expression templates, which you can use to "flatten" a computation to avoid copying of the data that would otherwise be necessary.
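
For readers who have not seen expression templates, the idea can be sketched in a few lines. Everything here (`vec`, `add_expr`, the overloads of `operator+`) is a hypothetical toy written for illustration, not a library API: `a + b + c` builds a lightweight expression object, and the sum is only evaluated, in a single loop with no temporary vectors, when it is assigned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical expression node: represents lhs + rhs without computing it.
template <typename L, typename R>
struct add_expr {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

class vec {
public:
    explicit vec(std::size_t n, double value = 0.0) : data_(n, value) {}

    // Assigning from an expression evaluates it element by element, once.
    template <typename L, typename R>
    vec& operator=(const add_expr<L, R>& e)
    {
        assert(e.size() == data_.size());
        for (std::size_t i = 0; i < data_.size(); ++i) {
            data_[i] = e[i];
        }
        return *this;
    }

    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// vec + vec yields an unevaluated expression node.
inline add_expr<vec, vec> operator+(const vec& a, const vec& b)
{
    return {a, b};
}

// (expression) + vec nests the existing node inside a new one.
template <typename L, typename R>
add_expr<add_expr<L, R>, vec> operator+(const add_expr<L, R>& a, const vec& b)
{
    return {a, b};
}
```

With `vec a(3, 1.0), b(3, 2.0), c(3, 3.0), r(3);`, the statement `r = a + b + c;` fills `r` in one pass. The expression nodes hold references, so they must be consumed within the same statement that creates them.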

Bottom line: at the very worst, C++ will end up precisely equivalent to C (i.e., in the very worst case, you'd write your C++ code virtually the same as C code, and see no difference in performance). At best, the extra features of C++ (especially templates) offer you chances to optimize in ways that are either impossible or grossly impractical with C.

Jerry Coffin

In the case you posted, you are basically asking about the performance difference between malloc and new.

In C++ the new operator can do additional work beyond allocating memory, such as calling the object's constructor or initializing the variables.

For equivalent operation in C, you will need to initialize variables after calling malloc. Change your C program to behave more like C++ then profile.
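
A sketch of what "the same work" looks like on both sides, written in C++ so the two can sit next to each other. The function names are hypothetical. Note that a plain `new particle[10]`, as in the question, leaves `x` uninitialized for a type like this, just as `malloc` does; it is the trailing `()` below that zeroes the members.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct particle {
    double x;
};

// C style: malloc returns uninitialized storage, so x is set explicitly.
// The caller releases the block with std::free.
particle* make_with_malloc(std::size_t n)
{
    particle* p = static_cast<particle*>(std::malloc(n * sizeof(particle)));
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    for (std::size_t i = 0; i < n; ++i) {
        p[i].x = 0.0;
    }
    return p;
}

// C++ style: the trailing () value-initializes every element, zeroing x.
// The caller releases the block with delete[].
particle* make_with_new(std::size_t n)
{
    return new particle[n]();
}
```

Timing these two against each other is the apples-to-apples comparison; timing an uninitialized `malloc` against an initializing `new` is not.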

In most performance questions, you need to compare apples to apples. If you use object-oriented programming in C++, you will need to code the equivalent in C before making comparisons. Given this, in most cases there is no significant difference in performance between the two languages.

Also consider development time, correctness, robustness, safety and your confidence or experience with the language. Performance differences are usually negligible when compared to these other attributes of a project.

Many performance issues depend not on the language but on the design and the platform. Cache misses, loop unrolling and the number of comparisons can all affect performance regardless of the language.

Thomas Matthews
  • Good points. I was not worried about new vs. malloc. I was worried that there might be an overhead in accessing the data members of a class compared to data stored in structs or directly in an array. Before I looked at the assembly generated by C++ for retrieving data members, I feared it might involve several jumps etc., which would make it slower than other kinds of data access. But it turns out to be almost the same assembly code, so I assume it is equivalent in terms of performance. I've never written a compiler for an object-oriented language. – Wuhtzu May 27 '13 at 06:16