
Assuming:

const int n = bigNumber;
float* f = someData;

Is there any point in converting

for (int i = 0; i < n; i++)
{
    f[i] += 1;
}

to

float* lastF = f + n;
for (float* i = f; i < lastF; i++)
{
    *i += 1;
}

Naively looking at this, it seems I save an addition operation on every iteration (the index arithmetic in f[i]).

Of course, this assumes I have no interest in the value of the indexer inside the loop.

  1. Am I right in this mindset?
  2. If so, is my compiler smart enough to do this on its own (assuming all optimization flags enabled)?

I would check the disassembly, but I am terrible at reading those.

Rotem
    It indeed used to be faster, in C thirty years ago. I did lots of this back in the 1980s. I don't know any details about how well, or even if, Visual C++ will optimize this loop, but the general rule nowadays is that yes, modern compilers will optimize it for you. – Thomas Padron-McCarthy Jan 22 '15 at 20:22
    The array index is absorbed into an indirect memory access. So it doesn't matter on most processors. But on Haswell processors, not all of the memory ports can handle complicated indirect memory access. So if you're in a situation where you need to saturate all 3 memory ports, one of them cannot be a pointer + index access. Compilers can't really be relied upon to do *this* extreme level of micro-optimization. They tend to favor indexing, so you may need to drop down to assembly to prevent the compiler from converting pointer increments to indexing. – Mysticial Jan 22 '15 at 20:24
  • @Mysticial Do you mean that the compiler may actually convert my second code to the first, making the code slower? Not sure what you meant by the last sentence. – Rotem Jan 22 '15 at 20:31
    @Rotem Yes, it's possible. Compilers are fully capable of converting between the two versions. Which one they prefer depends on a number of factors including the size of the datatype. (And of course they don't always make the best decision.) If datatype is not a power-of-two, then they tend to prefer pointer increments since indirect addressing is trickier. In such cases, they also like to introduce a separate index which they increment by the size of the datatype with each iteration. – Mysticial Jan 22 '15 at 20:34

3 Answers


No benefit whatsoever. Any compiler these days can do it better and faster than most programmers.

If `f[i++]` is more readable in your code than `*fp++`, don't think twice. They will be compiled to exactly the same object code.

n. m. could be an AI
    I've actually seen cases with contemporary compilers (e.g. `gcc`) that seem to generate better code with the `f[i++]` form than `*fp++`. I've never dug into it to figure out why (perhaps there is some arcane reason that governs what optimiziations can be made in each case), but I've found myself favoring the indexing method. – Jason R Jan 22 '15 at 20:23
    @JasonR Indirect memory access is (usually) free on modern processors. So the index is free. If you have a loop where you're incrementing multiple pointers, each of them still needs to be incremented. Whereas with the index method, you only need to increment the index. – Mysticial Jan 22 '15 at 20:31

Modern compilers like MS Visual Studio are smart enough to convert the indexed form into the pointer form internally.

However, since you are using MS Visual Studio, you had better stick to the indexed form, because it better suits Visual Studio's vectorizing optimizer.

The indexed form is the recommended one, and the one Visual Studio is more likely to understand. The pointer form is harder for the compiler to analyze, and if the compiler gets confused, it can be very hard for you to work out why.

For example, with my Visual Studio compiler (version 2013), the indexed form gets compiled into AVX code (256-bit registers like ymm0), while the pointer form uses SSE (128-bit registers like xmm0) - which probably runs slower by roughly a factor of 2 (I didn't measure it).

anatolyg

The difference is going to be too chaotic to have a general rule.

The two bits of code are logically equivalent, so under the as-if rule the compiler is free to treat one as the other.

Things change if the value or address of `i` is taken, especially if it is passed "non-locally", as that can force the compiler to give up its ability to transform one into the other.

Some compilers may be confused by one, but not the other.

The advantage of the first one is that the index is right there if you ever need it inside the loop.

The advantage of the second one is that you are mimicking "modern" C++ iterator syntax. (I would, however, be tempted to use `!=` rather than `<`: iterator-style loops conventionally compare with `!=`, and since even forming a pointer more than one past the end of an array is undefined behavior, you should be certain to stop exactly at the end, not blow past it.)

The real problem with your question is that it is almost certainly premature optimization. If it is not premature optimization, you should be certain that this code is a performance bottleneck, and you should profile and measure which is faster rather than looking for rules of thumb.

At the level of "writing code that isn't prematurely pessimized", neither really wins. The importance of clarity trumps any performance difference you will experience. Clarity matters for performance because clear, easy-to-understand code is easier to optimize, and directed optimization (where you take a measured bottleneck and make it perform better) is a far better use of performance-improvement time than puzzling over unclear code caused by unneeded micro-optimizations.

Yakk - Adam Nevraumont
  • Thanks, I was indeed more interested in "writing code that isn't prematurely pessimized". – Rotem Jan 22 '15 at 21:28