Performance of repetitive indirection

Question

I find myself debating whether I want to write like Code 1 vs Code 2. In my opinion, Code 1 looks cleaner, but in theory, can I expect a performance penalty due to its extra indirections compared to Code 2? Are there any relevant compiler optimizations here? Does anything change if bar() returns a Bar*?

Code 1:

foo.bar().method1();
foo.bar().method2();
foo.bar().method3();
foo.bar().method4();

Code 2:

Bar& bar = foo.bar(); //Java programmers: ignore ampersand
bar.method1();
bar.method2();
bar.method3();
bar.method4();

EDIT: I think there are too many variables to ask such a general question (e.g. const vs non-const methods, whether the compiler inlines the methods, how the compiler treats the references etc). Analyzing my specific code in assembly is perhaps the way to go.

You need to ask if the potential performance gains are worth changing how readable the code is. You should confirm if there are meaningful performance gains with performance tests. — ajp15243, Dec 31 '14 at 19:44
I think number 2 looks far cleaner than number 1. Chaining method calls like that is ugly. — Daniel, Dec 31 '14 at 19:46
Well, `bar()` should return a `Bar&` to start. Some people get around this via "Method Chaining", i.e. all the methods return `*this`, so one could do `foo.bar().method1().method2().method3().method4()`, but I find it to be a kindof weird idiom. — IdeaHat, Dec 31 '14 at 19:46
@Daniel I think it depends on context. If there are potentially many foo's being juggled around, then you need a new name for each of the bar's. — Agrim Pathak, Dec 31 '14 at 19:52
@anon Of course. If it's hard to come up with a good descriptive name that differentiates between the different `bar`s, then you have a bad design and/or bad names. In fact, even if you only have one `foo` thing around, you should still give a name for `foo.bar()` that would be sufficiently descriptive if there _had been_ another `foo`. — Daniel, Dec 31 '14 at 19:55
@IdeaHat Right, the post got edited. I re-edited. This question isn't about method chaining. The methods could actually return values. — Agrim Pathak, Dec 31 '14 at 19:56
I don't see any difference except that you are helping the compiler in `Code 2` example. The compiler may cache the result of `foo.bar()` very similar to your `Code 2` example. Print out the assembly language listing of both (compiler optimizations turned on), then compare the two. — Thomas Matthews, Dec 31 '14 at 20:11
@ThomasMatthews I think it's quite likely that the compiler will not cache the result of `foo.bar()` since there might be side-effects. — Daniel, Dec 31 '14 at 20:18
@Daniel: In both cases, there can be side effects. "The truth lies in the assembly language." — Thomas Matthews, Dec 31 '14 at 20:22
@ThomasMatthews What I mean is, since `bar` could have side-effects, the compiler wouldn't optimize 4 calls to it down to 1 since that would change the meaning of the program. — Daniel, Dec 31 '14 at 20:23

score 1 · Answer 1 · answered Dec 31 '14 at 19:47

Code_1 seems to have performance penalty when compared to Code_2.

But remember the most basic rule of robust C++ designs :- Premature optimizations is the root of all evil. Make your code first for clarity then appoint a good profiler as your "Guru".

score 1 · Answer 2 · answered Dec 31 '14 at 19:49

1

The second option, Bar bar = foo.bar() is definitely more efficient, though how much so depends on how heavy weight bar is. The difference could very well be trivial; try benchmarking.

As for readability, I would argue that the second option is more readable, but this is getting into personal style. I think what you really want though is a method5 which calls all four methods internally. Thus you can have

foo.bar().method5();

And thats it.

answered Dec 31 '14 at 19:49

David says Reinstate Monica

19,209
22
79
122

method5() makes sense the way I presented the question, but the methods themselves may return useful values. – Agrim Pathak Dec 31 '14 at 19:57
I want to upvote your answer for the other points, but the `method5` idea is really not good in most situations. – Daniel Dec 31 '14 at 19:57
@Daniel In the situation presented in the problem (void returning) its a great solution; I've done it myself several times. If you still want to return the stuff from the first 4 methods (as anon mentioned) you can just throw it in a map or something – David says Reinstate Monica Dec 31 '14 at 20:02
It _could_ be a good solution if those 5 methods have something to do with each other and are frequently called in sequence because that sequence constitutes some macro operation. But if the function calls have nothing to do with each other, it doesn't make much sense to group them together in one function. What would you call it? Methods are supposed to do exactly one thing. If those five function calls together can be called one thing, it makes sense. Otherwise it doesn't – Daniel Dec 31 '14 at 20:08
@Daniel We're making massive speculations here without knowing anything about the code. Generally speaking though you can and do name everything from low to high level stuff. This is no different. Moreover, based on the context of this question, there does seem to be some relationship between the 4 - at the very least they are all performed on `bar`, which implies some sort of strong relationship – David says Reinstate Monica Dec 31 '14 at 20:11
@DavidGrinberg Fair enough. I will accept that response. [See here](http://stackoverflow.com/questions/27725660/performance-of-repetitive-indirection?noredirect=1#comment43864143_27725861) – Daniel Dec 31 '14 at 20:27

score 1 · Answer 3 · answered Dec 31 '14 at 20:03

1

Depending on what bar actually does, there may or may not be (noticable) performance penalty. A more interesting question is what makes you think that your first approach is "cleaner".

Without knowing any details of the implementation, I actually tend to think the opposite: the latter approach is not only shorter (short is good, because less code is less bugs, and less stuff to read), but also cleaner and more readable.

It clearly reflects intent of the author, and does not leave the reader wondering about specifics of implementation of bar and that could result in unanticipated side-effects, that in turn may or may not be intentional and/or desired.

Don't do it, unless you have a very good reason to.

answered Dec 31 '14 at 20:03

Dima

39,570
6
44
70

Hypothetically, I think something like `aircraft.cargo_door().open()` is more readable than something like `CargoDoor& door = aircraft.cargo_door(); door.open();`. – Agrim Pathak Dec 31 '14 at 20:21
1

@anon I disagree. I just think you have chosen a bad name (`door`). If you were to instead call it `aircraftCargoDoor`, I think that would be the best of both worlds. – Daniel Dec 31 '14 at 20:25
@anon But in your case, I think you have really given a good example _in favor_ of David Grinberg's solution. The `Aircraft` class should have an `OpenDoor` method. So it would be `aircraft.OpenDoor()`. Method names should be verbs, not nouns. – Daniel Dec 31 '14 at 20:26
1

@anon In this case, when you only use it once, I agree, it is more readable. But repeating it over and over, as in your original question, is a very different story. "Readability" in code is not only about comprehending written words, but also about understanding the implication. Does your `cargo_door()` method access the database every time it is called to load the properties (or, perhaps, to log "door access")? Does it open up a ui for the user to provide credentials? Is it thread safe? What happens, if the actual door is replaced by another client between calls? Etc. – Dima Dec 31 '14 at 20:29

score 0 · Accepted Answer · answered Jan 01 '15 at 11:58

Reference tests

I ran a simple test. When compiled with no optimizations, on my machine Test_1 took 1272 ms and Test_2 1108 (I ran the tests several times, results within a couple of ms). With O2/O3 optimizations, both tests appeared to take an equal amount of time: 946 ms.

    #include <iostream>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <chrono>

    using namespace std;

    class Foo
    {
    public:
      Foo() : x_(0) {}
      void add(unsigned amt)
      {
        x_ += amt;
      }
      unsigned x_;
    };

    class Bar
    {
    public:
      Foo& get()
      {
        return foo_;
      }
    private:
      Foo foo_;
    };

    int main()
    {
      srand(time(NULL));
      Bar bar;
      constexpr int N = 100000000;
      //Foo& foo = bar.get(); //TEST_2
      auto start_time = chrono::high_resolution_clock::now();
      for (int i = 0; i < N; ++i)
      {
        bar.get().add(rand()); //TEST_1
        //foo.add(rand()); //TEST_2
      }
      auto end_time = chrono::high_resolution_clock::now();

      cout << bar.get().x_ << endl;
      cout << "Time: ";
      cout << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << endl;
    }

Pointer tests

I reran the tests, but this time with the class member being a pointer. When compiled with no optimizations, on my machine Test_3 took 1285-1340 ms and Test_4 1110 ms. With O2/O3 optimizations, both tests appeared to take an equal amount of time: 915 ms (surprisingly, less time than the reference tests above).

#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <chrono>

using namespace std;

class Foo
{
public:
  Foo() : x_(0) {}
  void add(unsigned amt)
  {
    x_ += amt;
  }
  unsigned x_;
};

class Bar
{
public:
  ~Bar()
  {
    delete foo_;
  }
  Foo* get()
  {
    return foo_;
  }
private:
  Foo* foo_ = new Foo;
};

int main()
{
  srand(time(NULL));
  Bar bar;
  constexpr int N = 100000000;
  //Foo* foo = bar.get(); //TEST_4
  auto start_time = chrono::high_resolution_clock::now();
  for (int i = 0; i < N; ++i)
  {
    bar.get()->add(rand()); //TEST_3
    //foo->add(rand()); //TEST_4
  }
  auto end_time = chrono::high_resolution_clock::now();

  cout << bar.get()->x_ << endl;
  cout << "C++ Time: ";
  cout << chrono::duration_cast<chrono::milliseconds>(end_time - start_time).count() << endl;
}

Conclusion

According to these simple tests on my machine, Code 2 style is slightly faster by an order of around ~15% when optimizations are not enabled, but with optimizations enabled, there is no difference in performance.

score 0 · Answer 5 · answered Jan 01 '15 at 15:42

I would argue that in most cases neither of the two alternatives is a good choice. Both expose internals via reference which breaks encapsulation. I think the level of abstraction of the Foo object is too low for its use and it should offer more service-like functions to do something with it.

Instead of

Bar& bar = foo.bar(); //Java programmers: ignore ampersand
bar.method1(); 
bar.method2(); 
bar.method3(); 
bar.method4();

There should be some higher-level method in Foo

class Foo
{
public:
    void doSomething()
    {
        Bar& bar = foo.bar(); //Java programmers: ignore ampersand
        bar.method1(); 
        bar.method2(); 
        bar.method3(); 
        bar.method4();
    }

private:
    Bar& bar();
};

You should never give non-const references to internals to clients.

Performance of repetitive indirection

5 Answers5