
So, I have a situation where I have to deal with huge (multidimensional) arrays and wondered whether C# or C++ would perform better. (Note: I'm a beginner at C++, so don't expect too much knowledge about how it works!) I thought that for arrays both languages would perform similarly, with C++ maybe slightly ahead. BUT: the results tell another story:

1012ms in C++ @32Bit
1020ms in C++ @32Bit
1002ms in C++ @32Bit

1155ms in C++ @64Bit
1098ms in C++ @64Bit
1122ms in C++ @64Bit
1136ms in C++ @64Bit


523ms in C# @32-Bit
545ms in C# @32-Bit
537ms in C# @32-Bit
536ms in C# @32-Bit

473ms in C# @64-Bit
501ms in C# @64-Bit
470ms in C# @64-Bit
498ms in C# @64-Bit

I performed one test run on the x86 architecture and one on x64. Two things here: why does C# do nearly twice as well as C++? And why is C# actually faster in x64 mode and C++ in x86 mode?!? I really didn't expect that to happen.

As I said, I'm currently not that experienced in C++ programming, but I tried my best to reproduce my C# code in C++.

Here is the code. C#:

for (int j = 0; j < 4; j++)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    struct1[] s1 = new struct1[20000000];
    int length = 20000000;

    for (int i = 0; i < length; i++)
    {
        s1[i] = new struct1();
        s1[i].samplechar = 'c';
        s1[i].sampleInt = i * 2;
        s1[i].sampledouble = Math.Sqrt(i);
    }

    sw.Stop();
    GC.Collect();
    Console.WriteLine(sw.ElapsedMilliseconds + "ms in C# @...-Bit");
}

And struct1:

public struct struct1
{
    public int sampleInt;
    public double sampledouble;
    public char samplechar;
}

C++:

for (int j = 0; j < 4; j++)
{
    auto begin = std::chrono::high_resolution_clock::now();

    struct1* s1 = new struct1[20000000];
    int length = 20000000;

    for (int i = 0; i < length; i++)
    {
        s1[i].sampleChar = 'c';
        s1[i].sampleInt = i * 2;
        s1[i].sampleDouble = sqrt(i);
    }
    auto end = std::chrono::high_resolution_clock::now();       

    std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() << "ms in C++ @64Bit" << std::endl;

    free(s1);

}

struct1:

struct struct1 {
public:
    int sampleInt;
    int sampleDouble;
    char sampleChar;

};

Note: I did not include garbage collection/free in the performance measurement, since in my case there will be one huge array that exists as long as the program is running. And I think it's clear that arrays of such sizes can't be created and deleted very often unless you want to kill your machine...

Note2: Another confusing thing: While C++ consumes about 250MB of RAM, C# takes 500MB. But why?

Thanks in advance for any kind of explanation. Maybe I'm just getting this wrong and the failure sits in front of the display, but I'm still interested in why I get these results.

Edit: I'm running Visual Studio 2017RC on Windows 10. C++ Optimisation is disabled (/Od)

Florian
  • Are you compiling with optimizations enabled? – Vittorio Romeo Mar 11 '17 at 22:53
  • Compiler version, switches and optimisation level? – Richard Critten Mar 11 '17 at 22:53
  • Online compiler is way faster than your measurements with `-O1`: http://melpon.org/wandbox/permlink/uqPzpCbt5dUWOxtQ – Vittorio Romeo Mar 11 '17 at 22:57
  • Obligatory link when someone asks "which is faster?" https://ericlippert.com/2012/12/17/performance-rant/ – DavidG Mar 11 '17 at 22:57
  • Side note: if you create the array with `new[]`, you should destroy it using `delete[] s1;` instead of `free(s1)` (although you should use none of these in C++11 anyway; see the sketch after these comments) – kennytm Mar 11 '17 at 22:59
  • "Edit: I'm running Visual Studio 2017RC on Windows 10. C++ Optimisation is disabled (/Od)" – there is your answer. – ThreeFx Mar 11 '17 at 22:59
  • Why are you using an int instead of a double in the C++ code? – Hannes Hauptmann Mar 11 '17 at 23:05
  • *Optimisation is disabled (/Od)* – thus your numbers and findings are meaningless. Until you post results from an optimized build, there isn't a need to discuss "which is faster". – PaulMcKenzie Mar 11 '17 at 23:06
  • Some basic mistakes: you compared the optimized C# code against the unoptimized C++ code, you declared sampleDouble as an *int* instead of a *double*, and *char* in C++ does not match *Char* in C#. The latter two mistakes explain why C++ used less memory. Actual perf of the C++ program is about 60% of the C# program; the array index checking in C# and the C++ optimizer doing a better job of enregistering the variables explain the difference. – Hans Passant Mar 12 '17 at 01:19
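To illustrate kennytm's side note, here is a minimal sketch of the modern-C++ alternative using std::vector, so that no new[]/delete[]/free is needed at all (one caveat: std::vector value-initializes its elements, which new[] does not do for this struct):

#include <cmath>
#include <vector>

struct struct1 {
    int    sampleInt;
    double sampleDouble;   // a real double here, matching the C# struct
    char   sampleChar;
};

int main()
{
    // The vector owns the allocation; it is released automatically at scope exit.
    std::vector<struct1> s1(20000000);

    for (int i = 0; i < (int)s1.size(); i++)
    {
        s1[i].sampleChar   = 'c';
        s1[i].sampleInt    = i * 2;
        s1[i].sampleDouble = std::sqrt((double)i);
    }
}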

2 Answers


With the correct optimizations and compiler settings, the C++ version should be at least as fast as the C# version. Keep in mind that you're not measuring the timings in exactly the same way (see if you can try it using a Stopwatch equivalent in C++?). See this question for more details on that: resolution of std::chrono::high_resolution_clock doesn't correspond to measurements. That is very likely the source of your different timings, and in particular of why the C++ implementation measures as taking longer: Stopwatch is more accurate than the Visual Studio/Windows implementation of std::chrono::high_resolution_clock you're using.
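As a rough sketch of a more defensive C++ measurement (nothing here beyond <chrono>; std::chrono::steady_clock is guaranteed monotonic, and the time_ms helper is just an illustrative name, not a standard facility):

#include <chrono>

// Illustrative helper: run a callable once and return the elapsed milliseconds.
template <typename F>
long long time_ms(F&& f)
{
    auto begin = std::chrono::steady_clock::now();   // monotonic clock
    f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
}

Running the loop body several times through such a helper and comparing the minimum of each set is usually more robust than comparing single runs.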

Also keep in mind that you're not actually dealing with multi-dimensional arrays (e.g. int[,] in C# parlance) - you're just dealing with an array of structs. For what it's worth, my recollection is that the CLR implementation of actual multidimensional arrays is not quite built for performance on large collections - it's probably possible to build a faster one (if in fact you want multi-dimensional arrays instead of just arrays of structs).
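For example, one common approach in C++ is a flat, contiguous block indexed as row * cols + col rather than a nested structure; the Grid type below is just a hypothetical sketch of that idea, not anything from the question:

#include <cstddef>
#include <vector>

// Hypothetical flat "2-D" array: one contiguous allocation,
// indexed manually in row-major order.
struct Grid
{
    int rows, cols;
    std::vector<double> data;

    Grid(int r, int c) : rows(r), cols(c), data((std::size_t)r * c) {}

    double& at(int r, int c) { return data[(std::size_t)r * cols + c]; }
};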

The bigger deal here is likely the ability to control memory management more tightly: if you're allocating large arrays of structs, you can control the lifetime of that allocation much more closely in C or C++ than you can in a garbage-collected environment. You can also use pointer arithmetic and memory copying more easily and naturally, although that's entirely possible in unsafe contexts in C# as well. This may yield some performance gains compared to a "vanilla" C# implementation in a safe context.
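As a sketch of the pointer-arithmetic idea (the fill function and its names are mine, not from the question, and an optimizing compiler will often perform this rewrite by itself):

#include <cmath>

struct struct1 { int sampleInt; double sampleDouble; char sampleChar; };

// Walk the array with a raw pointer instead of re-indexing s1[i] each time.
void fill(struct1* s1, int length)
{
    int i = 0;
    for (struct1* p = s1; p != s1 + length; ++p, ++i)
    {
        p->sampleChar   = 'c';
        p->sampleInt    = i * 2;
        p->sampleDouble = std::sqrt((double)i);
    }
}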

Dan Field

In addition to Dan Field's answer:

Note2: Another confusing thing: While C++ consumes about 250MB of RAM, C# takes 500MB. But why?

In C++ you use int,int,char, which with default alignment will be laid out as 4+4+4 = 12 bytes per element (unless struct packing is set to 1 byte, for example).

In C# you use int,double,char. The double requires 8-byte alignment, so the first int is padded to 8 bytes, and the trailing char (2 bytes in C#, since it is UTF-16) is padded out to 8 as well, giving 24 bytes per element. That matches your numbers: 20,000,000 × 24 bytes ≈ 480 MB for C# versus 20,000,000 × 12 bytes ≈ 240 MB for C++. If double,int,char were used instead, the struct would be smaller (16 bytes).
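A quick way to check these padding claims on your own compiler (the numbers in the comment assume a typical x64 target; note also that a C# char is 2 bytes, while a C++ char is 1 byte, as Hans Passant's comment points out):

#include <cstdio>

struct A { int i;    int d;    char c; };   // the question's C++ struct: int,int,char
struct B { int i;    double d; char c; };   // int,double,char as in the C# struct
struct C { double d; int i;    char c; };   // reordered: double,int,char

int main()
{
    // On a typical x64 compiler this prints 12 24 16.
    std::printf("%zu %zu %zu\n", sizeof(A), sizeof(B), sizeof(C));
}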

While I'm not a C# expert, I don't understand why you use s1[i] = new struct1(); in the loop when there is already an array of struct1s (but I could be wrong or missing something).

Also, when testing array performance I wouldn't use sqrt() in the loop, because that function is more expensive than looping through the items, so it dominates the measurement.

And of course, optimization should be enabled when testing performance.


Using loop unrolling could be even faster:

for (int i = 0; i < length; i+=4)
{
    s1[i].sampleInt = i * 2;
    s1[i].sampleDouble = i * 4;
    s1[i].sampleChar = 'c';    // reordered in order of struct
                               // not sure if it matters
    s1[i+1].sampleInt = (i+1) * 2;
    s1[i+1].sampleDouble = (i+1) * 4;
    s1[i+1].sampleChar = 'c';
                               // maybe 2 or 3 would be better than 4
    s1[i+2].sampleInt = (i+2) * 2;
    s1[i+2].sampleDouble = (i+2) * 4;
    s1[i+2].sampleChar = 'c';

    s1[i+3].sampleInt = (i+3) * 2;
    s1[i+3].sampleDouble = (i+3) * 4;
    s1[i+3].sampleChar = 'c';
}

Compared to

for (int i = 0; i < length; i++)
{
    s1[i].sampleChar = 'c';
    s1[i].sampleInt = i * 2;
    s1[i].sampleDouble = i * 4;
}
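One caveat with the unrolled version: it assumes length is a multiple of 4 (which 20,000,000 happens to be). A sketch of the general case with a scalar tail (fill_unrolled is my name, and it uses a corrected struct with a real double):

struct struct1 { int sampleInt; double sampleDouble; char sampleChar; };

void fill_unrolled(struct1* s1, int length)
{
    int i = 0;
    for (; i + 4 <= length; i += 4)   // main unrolled loop
    {
        s1[i].sampleInt        = i * 2;
        s1[i].sampleDouble     = i * 4;
        s1[i].sampleChar       = 'c';

        s1[i + 1].sampleInt    = (i + 1) * 2;
        s1[i + 1].sampleDouble = (i + 1) * 4;
        s1[i + 1].sampleChar   = 'c';

        s1[i + 2].sampleInt    = (i + 2) * 2;
        s1[i + 2].sampleDouble = (i + 2) * 4;
        s1[i + 2].sampleChar   = 'c';

        s1[i + 3].sampleInt    = (i + 3) * 2;
        s1[i + 3].sampleDouble = (i + 3) * 4;
        s1[i + 3].sampleChar   = 'c';
    }
    for (; i < length; i++)           // scalar tail: the leftover 0-3 elements
    {
        s1[i].sampleInt    = i * 2;
        s1[i].sampleDouble = i * 4;
        s1[i].sampleChar   = 'c';
    }
}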
Danny_ds