
I'm still fairly new to C++ and have a lot left to learn, but something that I've become quite attached to recently is using nested (multidimensional) vectors. So I may typically end up with something like this:

std::vector<std::vector<std::string> > table;

I can then easily access its elements like this:

std::string data = table[3][5];

However, recently I've been getting the impression that it's better (in terms of performance) to have a single-dimensional vector and then just use "index arithmetic" to access elements correspondingly. I assume this performance impact is significant for much larger or higher dimensional vectors, but I honestly have no idea and haven't been able to find much information about it so far.
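
A minimal sketch of what I mean by that "index arithmetic", assuming a row-major layout (the width, height, y and x values here are just made up for illustration):

std::size_t width = 10, height = 6;             // example dimensions
std::vector<std::string> flat(width * height);  // one contiguous block instead of nested vectors
std::size_t y = 3, x = 5;                       // row and column
std::string data = flat[y * width + x];         // same element as table[3][5] above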

While, intuitively, it kind of makes sense that a single vector would have better performance than a higher-dimensional one, I honestly don't understand the actual reasons why. Furthermore, if I were to just use single-dimensional vectors, I would lose the intuitive syntax I have for accessing elements of multidimensional ones. So here are my questions:

1. Why are multidimensional vectors inefficient?
2. If I were to only use a single-dimensional vector instead (to represent data in higher dimensions), what would be the best, most intuitive way to access its elements?

Justin
  • Here ya go: http://www.boost.org/doc/libs/1_64_0/libs/multi_array/doc/user.html – GManNickG Jul 26 '17 at 03:48
  • *Why are multidimensional vectors inefficient?* I haven't done any benchmarks. However, I am going to guess that cache misses will be the most significant contributor to degraded performance. See – R Sahu Jul 26 '17 at 03:49
  • Are you using the jaggedness of your multidimensional vectors or not? – Yakk - Adam Nevraumont Jul 26 '17 at 03:50
  • It *is* less efficient, but it's not that big of a deal. The problem is that you could have all of your elements right next to each other in memory, but if you nest vectors like this, they will be more spread out. For most applications, that actually doesn't matter. C++ programmers do tend to overemphasize efficiency in small things like this, but the truth is that if you are treating these as rectangular tables / matrices and you find that they are a bottleneck, you can replace them with another type really easily – Justin Jul 26 '17 at 03:56
  • Do note that you actually have two questions here. That means that this question could be closed as *too broad* – Justin Jul 26 '17 at 03:57
  • Another problem has to do with indirection. The outer vector will allocate an array of vectors (i.e. it has a `std::vector* data` inside of it). When you want to fetch a value `v[i][j]`, the outer vector *has* to dereference the pointer `data[i]`, and then the inner vector has to dereference to get `[j]`. This means two memory accesses vs. a single one. – Misguided Jul 26 '17 at 06:06
  • https://stackoverflow.com/questions/41400116/2d-vector-vs-1d-vector provides answers. At the points where a new vector is stored there will be strides, due to the member variables stored in the vector. The performance problem will be due to memory layout. – Hakes Jul 26 '17 at 07:08
  • @Yakk, yes, I may use jagged vectors at times. – JohnTravolski Jul 26 '17 at 13:30
  • somewhat-duplicate of https://stackoverflow.com/a/25611857/2757035 – underscore_d Jul 27 '17 at 11:58

1 Answer


It depends on the exact conditions. I'll talk about the case when the nested version is a true 2D table (i.e., all rows have equal length).

A 1D vector will usually be faster for every usage pattern, or at least it won't be slower than the nested version.

The nested version can be considered worse because:

  • it needs to allocate number-of-rows times, instead of once (a small sketch of the difference follows this list).
  • accessing an element takes an additional indirection, so it is slower (the extra indirection is usually slower than the multiplication needed in the 1D case).
  • if you process your data sequentially, it could be much slower when the 2D data is scattered around memory: there could be a lot of cache misses, depending on how the memory allocator returns the memory areas of the different rows.
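
A small sketch of the allocation difference (the element type and sizes here are only examples, not taken from the question):

const std::size_t rows = 100, cols = 50;   // example dimensions

// nested: one allocation for the outer array of row vectors,
// plus one per row -> rows + 1 allocations, with the rows possibly scattered on the heap
std::vector<std::vector<std::string> > nested(rows, std::vector<std::string>(cols));

// flat: a single contiguous allocation holding all rows back-to-back
std::vector<std::string> flat(rows * cols);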

So, if you go for performance, I'd recommend creating a 2D wrapper class around a 1D vector. This way, you get an API as simple as the nested version's, and you get the best performance too. And even if, for some reason, you decide to use the nested version instead, you can just change the internal implementation of this wrapper class.
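
A minimal sketch of such a wrapper, assuming row-major storage (the class and member names are just illustrative):

#include <cstddef>
#include <vector>

template <typename T>
class Table2D {
public:
    Table2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // element (row, col) lives at row * cols_ + col in the flat storage
    T& operator()(std::size_t row, std::size_t col)             { return data_[row * cols_ + col]; }
    const T& operator()(std::size_t row, std::size_t col) const { return data_[row * cols_ + col]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<T> data_;   // single contiguous allocation
};

Usage would look like Table2D<std::string> table(10, 20); table(3, 5) = "data";, and the internal representation can later be swapped out without touching the call sites.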

The most intuitive way to access 1D elements is y*width+x. But, if you know your access patterns, you can choose a different one. For example, in a painting program, tile-based indexing could be better for storing and manipulating the image. Here, data can be indexed like this:

int tileMask = (1<<tileSizeL)-1;                    // tileSizeL is log2 of tileSize
int tileX = x>>tileSizeL;                           // tile column containing x
int tileY = y>>tileSizeL;                           // tile row containing y
int tileIndex = tileY*numberOfTilesInARow + tileX;  // linear index of the tile itself
// each tile stores tileSize*tileSize elements contiguously; add the offset within the tile
int index = (tileIndex<<(tileSizeL*2)) + ((y&tileMask)<<tileSizeL) + (x&tileMask);

This method has better spatial locality in memory (pixels that are near each other tend to have nearby memory addresses). The index calculation is slower than a simple y*width+x, but this method can have far fewer cache misses, so in the end, it could be faster.
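
If you want to hide that calculation behind an API, a hedged sketch could look like this (the TiledImage struct and its member names are purely illustrative, not from the answer above):

#include <cstddef>
#include <vector>

struct TiledImage {
    int tileSizeL;                       // log2 of the tile side length (illustrative)
    int numberOfTilesInARow;             // image width measured in tiles (illustrative)
    std::vector<unsigned char> pixels;   // flat, tile-ordered storage

    // 1D index of pixel (x, y) using the tiled layout described above
    std::size_t indexOf(int x, int y) const {
        const int tileMask = (1 << tileSizeL) - 1;
        const int tileX = x >> tileSizeL;
        const int tileY = y >> tileSizeL;
        const int tileIndex = tileY * numberOfTilesInARow + tileX;
        return (static_cast<std::size_t>(tileIndex) << (tileSizeL * 2))
             + ((y & tileMask) << tileSizeL) + (x & tileMask);
    }

    unsigned char& at(int x, int y) { return pixels[indexOf(x, y)]; }
};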

geza
  • Even better, don't create the wrapper yourself, use something like Boost.MultiArray. – Sebastian Redl Jul 26 '17 at 10:57
  • If you are using tile-based access, access "flowing over" from one tile to another is rarer than from "one line to another" or "one line to the next" in my experience; permitting independent allocations can make more sense there (as they are easier to manage). The cost of cache misses scales inversely with frequency and predictability. – Yakk - Adam Nevraumont Jul 26 '17 at 13:35
  • @Yakk: exactly, that's the point of tiling :) But I fail to see how independent allocations make more sense here. – geza Jul 26 '17 at 13:42
  • @SebastianRedl: yes, if one already uses boost. But using boost for just a little wrapper, which can be written in ~10 minutes, is maybe not the best idea. Compiling with boost is slow. Just adding `#include <boost/multi_array.hpp>` into a .cpp file increases compile time by 1 sec. If this array is a central one, and used in 1000 files, for 4 cpu cores, it increases compile time by 4 minutes. So I wouldn't call this "better" for every scenario (for example, one of our projects, which contains ~1000 cpp files, compiles in 40 seconds. Boosting this 40 sec to 4:40 is not good). – geza Jul 26 '17 at 13:48
  • @geza not more sense, but the cost of independent allocations is lower for tiles than for lines in general. – Yakk - Adam Nevraumont Jul 26 '17 at 14:22