
I have an object like this:

class Node {
    float x, y, z;
    size_t tag;
    bool isFree;
    std::vector<size_t> connections; // Usually ~10-100 in length
};

Just to give you an idea of size. There is a list of these Node objects containing millions of instances, which I'll call std::vector<Node> masterNodes. I have a function elsewhere that returns a container of these objects, such as this one:

std::vector<Node> find_nodes()
{
    std::vector<Node> nodes;
    // copy some elements from masterNodes that meet our conditions
    return nodes;
}

My question is would it be more efficient to return a vector of Node* instead, or is my compiler going to optimize this enough that the gain would be minimal for objects like mine? E.g.

std::vector<Node*> find_nodes()
{
    std::vector<Node*> nodes;
    // point to some elements from masterNodes that meet our conditions
    return nodes;
}

I've seen some replies (such as this one) that suggest copies might be nearly as efficient as returning pointers, while acknowledging the danger of returning pointers to vector elements.

Fadecomic
  • If each `Node` has a `std::vector` of potentially 100 elements, and you have a `std::vector<Node>` of those, that sure will be a lot larger than a `std::vector<Node*>`. Just make sure you know who should be managing that memory so you aren't leaking it, and aren't holding onto dangling pointers. – Cory Kramer Mar 27 '15 at 14:23
  • It is quite likely the vector of objects will be more efficient. But if it matters, you should try both and see (but include the potential memory management needed for the second version in your benchmarks). – juanchopanza Mar 27 '15 at 14:34
  • Another option is to return indices, I suppose. But that ties me to a vector, and it means that I have to reference the index back to masterNodes. It's not a totally unappealing idea, though, because indices are in the problem domain, too, and not just the implementation. Nodes have indices. – Fadecomic Mar 27 '15 at 15:07
  • I think the 'returning' isn't really relevant here. The copying of nodes from masterNodes is the real issue. Once the copy is complete, RVO will make the return itself very efficient. But we don't care about a fast return when the copying within `find_nodes()` is itself slow. – Aaron McDaid Mar 27 '15 at 15:23
  • What are you doing with the vector of nodes retrieved from `find_nodes()`? This is an important detail in determining the most appropriate solution to your question. – Julian Mar 27 '15 at 15:52
  • Well this is by no means definitive, but I ran this test case: http://pastebin.com/TFCta1Ud with O2 optimization using g++ v 4.8.2, and the difference was extreme. Using a copy, this code with one million nodes ran in 10s on my machine, and using pointers, it ran in 0.5s. It's a mockup, and hastily coded, so please be gentle. – Fadecomic Mar 27 '15 at 18:10

4 Answers


It would be more efficient to return a vector of Node*, because your nodes is a vector of copies of Nodes from masterNodes, and your Node is much bigger than a pointer. Neither Return Value Optimisation nor move semantics can help with the fact that you build (and return) a vector of copies.

BTW, you may return a vector<vector<Node>::iterator> instead of vector<Node*>. It is as efficient as Node*, at least in a release build, but iterators usually come with integrated checks in a debug build, which could help catch errors.
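A sketch of that iterator-returning variant, with a hypothetical matches() predicate standing in for the real selection condition:

```cpp
#include <cstddef>
#include <vector>

struct Node {
    float x, y, z;
    std::size_t tag;
    bool isFree;
    std::vector<std::size_t> connections;
};

std::vector<Node> masterNodes;

// Hypothetical predicate standing in for "meets our conditions".
bool matches(const Node& n) { return n.isFree; }

std::vector<std::vector<Node>::iterator> find_nodes()
{
    std::vector<std::vector<Node>::iterator> nodes;
    for (auto it = masterNodes.begin(); it != masterNodes.end(); ++it)
        if (matches(*it))
            nodes.push_back(it);
    return nodes;
}
```

The same caveat as with raw pointers applies: these iterators are invalidated by anything that reallocates masterNodes or erases the referenced elements.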

srdjan.veljkovic

Locality of reference

Real-life performance is very much dependent on the hardware, and much can be gained if you know how to use it.

One of the biggest hardware-induced performance gains comes from exploiting locality of reference. This means that working with data located in close proximity, both in time and space, can make better use of the built-in CPU cache, which is much faster than main memory (RAM).

This is why copying data, to allow contiguous local access, can give you a performance boost.

The opposite of this is using indirection. Indirection is the ability to access memory through a reference or pointer instead of the value itself. This allows you to avoid copying things, BUT you cannot make good use of the CPU cache when the hardware has to fetch every piece of data from a different place in main memory all the time.

Performance must be tested

Basically, copying big things will incur a one-time performance penalty but if you will work a lot with the data you can make up for this using locality of reference.

However, you have to test it yourself to know what works best for you. It might be that in your case the cost of copying the data will incur a bigger performance penalty than better use of the CPU cache will make up for.

Felix Glas
  • I meant to acknowledge this in the question, sorry. The size of the return from `find_nodes()` is much smaller than the number of times that `find_nodes()` will be called. E.g. find_nodes() will be called on the order of the size of masterNodes (millions), but the result will be on the order of 10-100 nodes. – Fadecomic Mar 27 '15 at 15:21
  • @Fadecomic Well, my reasoning comes into play when you start *working* with the data (after the return from `find_nodes`). If all you do is measure the time it takes to copy objects vs pointers, then of course copying pointers will be faster. – Felix Glas Mar 27 '15 at 15:32

When you use std::vector<Node> as the return type, all the data is duplicated, and that takes time. Using std::vector<Node*> means you return only the addresses of the data, so no duplication is done. But if you make this choice, you have to be careful with modifications of the data, because any modification made through the pointers is a modification of masterNodes itself.


You should try the std::copy_if algorithm. According to the reference:

In practice, implementations of std::copy avoid multiple assignments and use bulk copy functions such as std::memmove if the value type is TriviallyCopyable.

You can make your Node implementation meet the requirements to be considered TriviallyCopyable (use std::array instead of std::vector for connections), so that copying nodes with std::copy_if is very fast.
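A sketch of such a trivially copyable Node, with an assumed cap of 100 connections and a hypothetical isFree predicate standing in for the real condition:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Fixed-capacity variant of Node; the 100-connection limit is assumed.
struct Node {
    float x, y, z;
    std::size_t tag;
    bool isFree;
    std::array<std::size_t, 100> connections;
    std::size_t connectionCount;   // how many slots are actually used
};

static_assert(std::is_trivially_copyable<Node>::value,
              "Node can now be copied with bulk memory operations");

std::vector<Node> find_nodes(const std::vector<Node>& masterNodes)
{
    std::vector<Node> nodes;
    std::copy_if(masterNodes.begin(), masterNodes.end(),
                 std::back_inserter(nodes),
                 [](const Node& n) { return n.isFree; }); // assumed predicate
    return nodes;
}
```

Note that the bulk-copy (memmove) optimization in the quote is described for std::copy; std::copy_if still copies element by element because of the predicate, but each element copy is now a trivial memberwise copy rather than a std::vector allocation.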

On the other hand, copying nodes is limited by memory: if you don't have enough, you could get an out-of-memory error. If you are sure you will never return more than 100 nodes, you have this under control.

And if you work with pointers, the application has to manage that memory itself. This decreases the amount of memory used, but can increase the time needed due to the memory management.

But the best answer you will get is from testing both options.

Raydel Miranda