
I'm at a performance bottleneck in my program where I need to access elements from an array millions of times in a tight loop.

I looked around and the general consensus seems to be that even though multidimensional arrays should be faster, their underlying implementation is inefficient so just use jagged arrays instead. I profiled it and sure enough, jagged arrays are 50% faster. Fine.

However, I also tried manual indexing, that is, simulating a multidimensional array with a single-dimension array and computing the offset myself, like this: object value = array[i * 24 + j]; (where 24 is the size of the second dimension).

Surprisingly, this was also faster than a jagged array by around 15% for accesses (all I care about). This saddens me because, for one, it seems silly that manually recreating multidimensional arrays is this much faster than C#'s built-in implementation and, two, the math involved in computing indices is uglier than just indexing into jagged/multidimensional arrays.

Is there anything I can do to recoup the speed benefits without having to use my own manual indexing? Surely there is some sort of optimization that can be set or checked to simulate this behavior? Why is the C# implementation of arrays so inefficient?

OmG
John Smith
  • The built in multi dimensional arrays implementation applies bounds checks on every coordinate, yours only a single bounds check. I didn't check, but it might even apply a second bounds-check for the lower bound (which is not required to be zero). – CodesInChaos Mar 12 '15 at 15:28
  • Because C# has optimized single dimensional arrays... Multidimensional arrays are a forgotten "product"... They aren't even supported by LINQ. Jagged arrays are slower than manual multiplication because with jagged arrays you have to jump around to take the reference to the nested array. So you do a bound check on the external array, an array access to read the reference of the inner array, a bound check on the inner array and finally an array access to read the final data. All of this for a 2D array. For a 3D array it's clearly longer. – xanatos Mar 12 '15 at 15:28
  • Note that modern CPUs are *very* fast at the integer math involved to find an index. (Newer CPUs also support [FMA](http://en.wikipedia.org/wiki/FMA_instruction_set), but I'm not sure if the CLR takes advantage of such.) – user2864740 Mar 12 '15 at 15:54

1 Answer


> Surprisingly, this was also faster than a jagged array by around 15% for accesses

This should come as no surprise at all, because indexing a jagged array requires an additional dereference. When you write a[i][j], the computer must do the following:

  • Compute the location of nested array i inside the jagged array a
  • Fetch the location of the nested array a[i] (first dereference)
  • Compute the location of element j in a[i]
  • Fetch the value at location j of a[i] (second dereference)

When you fold a 2D array into a vector, the computer does only one dereference:

  • Compute the location of the target element
  • Fetch the value from the array (the only dereference)

Essentially, you are trading a dereference for a multiplication; multiplication is cheaper.
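
As a minimal sketch of the two access paths described above: the class name IndexingDemo, the row count of 4, and the fill values are hypothetical and chosen only for illustration; the column count of 24 comes from the question. Both methods return the same element, but the jagged version has to load the reference to the nested array before it can read the value.

```csharp
using System;

// Hypothetical demo type; only the column count (24) comes from the question.
public static class IndexingDemo
{
    public const int Rows = 4;   // assumed for illustration
    public const int Cols = 24;  // "24 is an array size" in the question

    // Jagged layout: an outer array holding references to separate inner arrays.
    public static int[][] BuildJagged()
    {
        var a = new int[Rows][];
        for (int i = 0; i < Rows; i++)
        {
            a[i] = new int[Cols];
            for (int j = 0; j < Cols; j++)
                a[i][j] = i * 100 + j;
        }
        return a;
    }

    // Flat layout: one contiguous array, stored row after row.
    public static int[] BuildFlat()
    {
        var a = new int[Rows * Cols];
        for (int i = 0; i < Rows; i++)
            for (int j = 0; j < Cols; j++)
                a[i * Cols + j] = i * 100 + j;
        return a;
    }

    // Two dereferences: fetch the reference a[i], then fetch element j of it.
    public static int ReadJagged(int[][] a, int i, int j)
    {
        int[] row = a[i];
        return row[j];
    }

    // One multiplication, one addition, one dereference.
    public static int ReadFlat(int[] a, int i, int j)
    {
        return a[i * Cols + j];
    }
}
```

This only illustrates the shape of the work per access; it is not a benchmark, and the actual difference depends on the JIT and on the access pattern.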

In addition, you get contiguity of elements in memory, something that a jagged array cannot guarantee because each nested array is a separate heap object. This becomes important for code that is sensitive to cache hits.

> Is there anything I can do to recoup the speed benefits without having to use my own manual indexing?

Using your indexing scheme is the way to go. You can hide it from readers of your code by writing a class, say, Matrix2D, that exposes an indexer (C#'s counterpart of operator []) taking two indexes and returning the value. The offset computation then lives in one place, and call sites write a[i, j] instead of a[i * 24 + j].
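
One possible shape for such a wrapper, as a sketch rather than a definitive implementation: the generic parameter, the Rows and Columns properties, and the row-major layout are assumptions that go beyond what is described above.

```csharp
using System;

// Sketch of a flat-array-backed 2D container. Names other than Matrix2D
// are illustrative assumptions.
public sealed class Matrix2D<T>
{
    private readonly T[] _data;   // single contiguous backing array, row-major

    public int Rows { get; private set; }
    public int Columns { get; private set; }

    public Matrix2D(int rows, int columns)
    {
        if (rows < 0) throw new ArgumentOutOfRangeException("rows");
        if (columns < 0) throw new ArgumentOutOfRangeException("columns");
        Rows = rows;
        Columns = columns;
        _data = new T[rows * columns];
    }

    // Two-argument indexer: callers write m[i, j]; the offset math stays here.
    // Note: only the flat array's single bounds check applies, so a column
    // index >= Columns can silently land in another row instead of throwing.
    public T this[int row, int column]
    {
        get { return _data[row * Columns + column]; }
        set { _data[row * Columns + column] = value; }
    }
}
```

Usage would look like `var m = new Matrix2D<int>(10, 24); m[3, 5] = 42;`. Whether the indexer call is inlined, and therefore whether the wrapper keeps the full speed of raw manual indexing, depends on the JIT, so it is worth profiling against the hand-written version.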

Sergey Kalinichenko
  • Does this answer change when using a jagged array where the low index (j) changes the most frequently - ie. over for i, for j iteration - and `a[i]` is assigned to a temporary variable (such that it removes deopt cases or confusion when discussing the operation occurring)? I would not be surprised if this was "just as fast" in such an iterative (non-random access) case. A number of implementations (with jagged arrays or not) take advantage of a *deliberate* row or column access, depending on layout, for locality. – user2864740 Mar 12 '15 at 15:48
  • @user2864740 The pattern of accessing your array does change the actual speedup. Getting to "just as fast" would be difficult even then, because in-memory contiguity gives the "flat" array a considerable advantage when the dimension size is as small as 24. Had the other dimension been much larger, say, in the 1000s, the difference wouldn't be as noticeable. – Sergey Kalinichenko Mar 12 '15 at 15:55