4

Consider the following code:

std::vector<std::pair<char, char>> v;

for (const auto &p : v) {
   // ...
}
for (const auto &[first, second] : v) {
   // ...
}

There is a vector of pairs of chars and two ways to iterate over the elements using range for loop. In the first case p becomes a reference to a pair while in the second case (if not optimized) first and second become references to the elements of the pair. But the pair of chars is just 2 bytes long while even a single reference is larger. So, I assume that the following code should be more efficient (again, not considering optimization):

std::vector<std::pair<char, char>> v;

for (const auto p : v) {
   // ...
}
for (const auto [first, second] : v) {
   // ...
}

It should be more efficient to copy 2 bytes to local variables and use them at known offsets rather than to create a reference and access the elements of the pair dereferencing it. But clang produces warnings (that are taken as errors in my system):

loop variable 'p' creates a copy from type 'const std::pair<char, char>'
use reference type to prevent copying
loop variable '<structured bindings>' creates a copy from type 'const std::pair<char, char>'
use reference type to prevent copying

The same warning is produced if I use std::list instead of std::vector.

Is that always true that iterating pairs by reference is more efficient than by value (and why if that is true). If not, is that a bug in clang that it is over restrictive?

Dmitry Kuzminov
  • 6,180
  • 6
  • 18
  • 40
  • 3
    *It should be more efficient to copy 2 bytes to local variables and use them at known offsets rather than to create a reference and access the elements of the pair dereferencing it* Yes, but the thing with references is they are very easy to optimize away and the compiler will just work with the object directly in the vector. You'd need to check the assembly, or use a profiler, but my money is references being faster. They are also needed if you want to modify. – NathanOliver Feb 24 '23 at 18:32
  • 3
    "more efficient" and "not considering optimization don't play well together. Optimization can turn around what seems micro-optimal. – lorro Feb 24 '23 at 18:33
  • @NathanOliver, my main question is whether clang is right in that I always need to prefer a reference in this case. – Dmitry Kuzminov Feb 24 '23 at 18:34
  • 2
    @463035818_is_not_a_number, I guess that compilers should avoid "opinion-based" warnings, like we avoid opinion-based questions at StackOverflow. – Dmitry Kuzminov Feb 24 '23 at 18:38
  • 1
    @DmitryKuzminov The warning is likely clangs way of say hey, if you don't actually need a copy, use a reference, I've got the ability to make it faster if I do. I don't know about you but I appreciate that from the compiler. – NathanOliver Feb 24 '23 at 18:41
  • anyhow, "is it more efficient?" can be answered by looking at the compilers output and by measuring. I'd expect to get the same when optimizations are tunred on – 463035818_is_not_an_ai Feb 24 '23 at 18:43
  • 2
    Your mental model of "creating a reference" should be "creates an alias that means exactly the same thing as the thing" (but the reference doesn't control the lifespan of the object, except in the cases where it does). – Eljay Feb 24 '23 at 18:44
  • 2
    I think this warning is actually aimed at eliminating a common typo when someone forgets to add `&` while iterating *over content* of container. – user7860670 Feb 24 '23 at 18:47
  • @NathanOliver *The warning is likely clangs way* I think the question is exatly about clang having a valid point here or not. It is made by mere mortals and is not expected to be universally correct, it can have bugs. Thus the answer may be "No, Clang is not always correct", or "This question can not be answered because the Standard does not regulate this" – Sergey Kolesnik Feb 24 '23 at 18:49
  • It seems like a clang-tidy warning though – Sergey Kolesnik Feb 24 '23 at 18:52
  • @Eljay, your point is interesting, but it doesn't hold for iterating `vector` by value: and clang doesn't complain on that. – Dmitry Kuzminov Feb 24 '23 at 18:54
  • Warnings are not always right; if they were always right, they'd be errors. – user253751 Feb 24 '23 at 19:00
  • @user253751, that is not true. For example, UB is not an error while it is not correct. – Dmitry Kuzminov Feb 24 '23 at 19:04
  • 2
    @DmitryKuzminov what? UB is not a warning, either. – user253751 Feb 24 '23 at 19:07
  • *Is that always true that iterating pairs by reference is more efficient than by value (and why if that is true).* **Probably,** but you'd have to check the optimized assembly or profile to know for sure. *If not, is that a bug in clang that it is over restrictive?* Other than required diagnostics, elective warnings are opt-in. Some warnings of debatable utility, and are only of use in limited contexts (e.g., I opt-out of this one: `-Wno-padded`). – Eljay Feb 24 '23 at 20:30
  • IMHO, you should always prefer to not copy large data structures; it's just a waste of time. If the data structure can fit inside a processor's word, the copying is less significant. – Thomas Matthews Feb 24 '23 at 21:15
  • To quote you: "It should be more efficient to copy 2 bytes to local variables and use them at known offsets rather than to create a reference and access the elements of the pair dereferencing it." Using a reference and accessing the bytes using a known offset is the same as what you wrongly describe as "more efficient". It is more efficient NOT to make a copy, if the all other operations are the same. – Michaël Roy Feb 25 '23 at 04:17
  • @MichaëlRoy, *if* – Dmitry Kuzminov Feb 25 '23 at 05:30
  • @DmitryKuzminov Since in both case, the operations are 2 indirect memory accesses at an address + offset, they are indeed the same. So an extra copy involves an extra operation. That's the reason why clang issues a warning. – Michaël Roy Feb 27 '23 at 16:47

0 Answers0