
I know that the postfix versions of the increment/decrement operators will generally be optimised by the compiler for built-in types (i.e. no copy will be made), but is this the case for iterators?

They are essentially just overloaded operators, and could be implemented in any number of ways, but since their behaviour is strictly defined, can they be optimised, and if so, are they by any/many compilers?

#include <vector> 

void foo(std::vector<int>& v){
  for (std::vector<int>::iterator i = v.begin();
       i!=v.end();
       i++){  //will this get optimised by the compiler?
    *i += 20;
  }
}
Dominic Gurto
  • This is an interesting question, even if it **is** a micro-optimization. – jpm Jun 21 '11 at 02:10
  • Unless iterator operations have side effects, the compiler is allowed to optimize the postincrement version under the *as-if* rule. Whether it does or not depends on the compiler. In a debug build it probably won't be optimized, so why make debugging slower? I don't see a problem with developing a good habit of using postincrement only when you actually need it. – Gene Bushuyev Jun 21 '11 at 02:23
  • @Gene I agree, and I'm in the habit of using pre-increment whenever I can. I'm just curious :) – Dominic Gurto Jun 21 '11 at 02:30

1 Answer


In the specific case of std::vector with GCC's standard library implementation (libstdc++, version 4.6.1), I don't think there would be a performance difference at sufficiently high optimization levels.

The implementation of vector's ordinary (non-reverse) iterators is provided by __gnu_cxx::__normal_iterator<typename _Iterator, typename _Container>. Let's look at its constructor and postfix ++ operator:

  explicit
  __normal_iterator(const _Iterator& __i) : _M_current(__i) { }

  __normal_iterator
  operator++(int)
  { return __normal_iterator(_M_current++); }

And its instantiation in vector:

  typedef __gnu_cxx::__normal_iterator<pointer, vector> iterator;

As you can see, it internally performs a postfix increment on an ordinary pointer, then passes the original value through its own constructor, which stores it in the member of the returned temporary. When the caller ignores that temporary, the code should be trivial to eliminate through dead value analysis once the operator is inlined.
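
To make that pattern concrete, here is a minimal sketch of a pointer-wrapping iterator with both operator forms. It is not the libstdc++ source: `PtrIter` and the helper functions are hypothetical names I made up for illustration, and the sketch only assumes that the postfix form returns a temporary built from the old pointer, as in the excerpt above.

```cpp
#include <cassert>

// Hypothetical stand-in for a pointer-wrapping iterator (not the libstdc++ code).
template <typename T>
class PtrIter {
public:
    explicit PtrIter(T* p) : current_(p) {}

    // Prefix: advance, then return a reference to this iterator (no copy).
    PtrIter& operator++() { ++current_; return *this; }

    // Postfix: advance, but return a temporary holding the old position.
    PtrIter operator++(int) { return PtrIter(current_++); }

    T& operator*() const { return *current_; }
    T* base() const { return current_; }

private:
    T* current_;
};

// Advances two iterators over the same array, one with each operator form,
// and reports whether they end up at the same position.
inline bool discarded_postfix_matches_prefix() {
    int data[3] = {1, 2, 3};
    PtrIter<int> a(data);
    PtrIter<int> b(data);
    ++a;  // result unused
    b++;  // returned temporary is discarded immediately
    return a.base() == b.base() && *a == 2 && *b == 2;
}

// The two forms only differ observably when the result is kept.
inline bool kept_postfix_yields_old_position() {
    int data[3] = {1, 2, 3};
    PtrIter<int> it(data);
    PtrIter<int> old = it++;  // old refers to element 0, it to element 1
    return *old == 1 && *it == 2;
}

// Convenience self-check that exercises both helpers.
inline void check_ptr_iter() {
    assert(discarded_postfix_matches_prefix());
    assert(kept_postfix_yields_old_position());
}
```

Calling `check_ptr_iter()` from a `main` exercises both cases. The sketch only shows that the two forms are semantically equivalent when the result is discarded; whether a given compiler actually emits the same code is a separate question.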

But is it really optimized? Let's find out. Test code:

#include <vector>

void test_prefix(std::vector<int>::iterator &it)
{
    ++it;
}

void test_postfix(std::vector<int>::iterator &it)
{
    it++;
}

Output assembly (compiled with -Os):

    .file   "test.cpp"
    .text
    .globl  _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .type   _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, @function
_Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE:
.LFB442:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    8(%ebp), %eax
    addl    $4, (%eax)
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE442:
    .size   _Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, .-_Z11test_prefixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .globl  _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .type   _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, @function
_Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE:
.LFB443:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    movl    8(%ebp), %eax
    addl    $4, (%eax)
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE443:
    .size   _Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE, .-_Z12test_postfixRN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEE
    .ident  "GCC: (Debian 4.6.0-10) 4.6.1 20110526 (prerelease)"
    .section    .note.GNU-stack,"",@progbits

As you can see, exactly the same assembly is output in both cases.

Of course, this may not necessarily be the case for custom iterators, or more complex data types. But it appears that, for vector specifically (at least with this GCC version at -Os), prefix and postfix increment have identical performance when the postfix return value is not captured.
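
As a rough illustration of the custom-iterator caveat, here is a sketch of an iterator whose copy is not free. Everything in it is hypothetical (`BufferedIter`, its `copies` counter, and the helper functions are made up for this example). The copy constructor allocates and has a visible side effect, so I would not expect the optimizer to drop it the way it drops the pointer copy above, though I have not inspected the generated assembly for this case.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical iterator that carries its own buffer, so copying it is not free.
class BufferedIter {
public:
    static int copies;  // counts copy constructions, for illustration only

    explicit BufferedIter(std::vector<int> buffer)
        : buffer_(std::move(buffer)), pos_(0) {}

    // The copy duplicates the whole buffer and bumps the counter.
    BufferedIter(const BufferedIter& other)
        : buffer_(other.buffer_), pos_(other.pos_) { ++copies; }

    // Prefix: no copy is made.
    BufferedIter& operator++() { ++pos_; return *this; }

    // Postfix: must copy the old state before advancing.
    BufferedIter operator++(int) {
        BufferedIter old(*this);
        ++pos_;
        return old;
    }

    int operator*() const { return buffer_[pos_]; }

private:
    std::vector<int> buffer_;
    std::size_t pos_;
};

int BufferedIter::copies = 0;

// Number of copy constructions triggered by one prefix increment.
inline int copies_made_by_prefix() {
    BufferedIter it(std::vector<int>{1, 2, 3});
    BufferedIter::copies = 0;
    ++it;
    return BufferedIter::copies;
}

// Number of copy constructions triggered by one postfix increment whose
// result is discarded. Returning the local may or may not be elided, so
// the count is at least one rather than exactly one.
inline int copies_made_by_postfix() {
    BufferedIter it(std::vector<int>{1, 2, 3});
    BufferedIter::copies = 0;
    it++;
    return BufferedIter::copies;
}

// Convenience self-check that exercises both helpers.
inline void check_buffered_iter() {
    assert(copies_made_by_prefix() == 0);
    assert(copies_made_by_postfix() >= 1);
}
```

For an iterator like this, writing `++it` in loops avoids the copy regardless of what the optimizer manages to do.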

bdonlan
  • Nice analysis. I expect that any good optimizing compiler will perform the same. – Mark Ransom Jun 21 '11 at 02:21
  • Thanks for that answer. I suspect @Mark is right, though I'm curious about other compilers. – Dominic Gurto Jun 21 '11 at 02:32
  • Feel free to test with other compilers as well :) – bdonlan Jun 21 '11 at 02:35
  • Is this a valid test, though? Are you sure you're not just seeing the optimizer replace `it++` with `++it` when it sees that the result is ignored? This would seem to be a good optimization to make, considering the widespread use of the postfix `++` in `for` loops. (Don't laugh; I've seen the optimizer replace `printf("foo\n");` with `puts("foo");`.) – Mike DeSimone Jun 21 '11 at 02:44
  • That is precisely what I'm seeing and what I'm testing for :) Obviously if you actually store the return value of `it++` somewhere it has to do some extra work to do this; the same would apply for `x = ++it`. The question is, is the optimizer smart enough to not do the extra work when it's not needed. – bdonlan Jun 21 '11 at 02:50