
C++ so far (unfortunately) doesn't support a finally clause for a try statement, which leads to speculation about how best to release resources. I found some solutions online, but their performance was not clear to me (and I would use Java if performance didn't matter that much), so I had to benchmark.

The options are:

  1. Functor-based finally class proposed at CodeProject. It's powerful, but slow. And the disassembly suggests that outer function local variables are captured very inefficiently: pushed to the stack one by one, rather than passing just the frame pointer to the inner (lambda) function.

  2. RAII: a manual cleaner object on the stack. The disadvantage is the manual typing and tailoring for each place it is used. Another disadvantage is the need to copy into it all the variables needed for resource release.

  3. MSVC++-specific __try / __finally statement. The disadvantage is that it's obviously not portable.

I created this small benchmark to compare the runtime performance of these approaches:

#include <chrono>
#include <cstdint>     // int64_t
#include <cstdio>
#include <functional>
#include <utility>     // std::forward

class Finally1 {
  std::function<void(void)> _functor;
public:
  Finally1(const std::function<void(void)> &functor) : _functor(functor) {}
  ~Finally1() {
    _functor();
  }
};

void BenchmarkFunctor() {
  volatile int64_t var = 0;
  const int64_t nIterations = 234567890;
  auto start = std::chrono::high_resolution_clock::now();
  for (int64_t i = 0; i < nIterations; i++) {
    Finally1 doFinally([&] {
      var++;
    });
  }
  auto elapsed = std::chrono::high_resolution_clock::now() - start;
  double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  printf("Functor: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}

void BenchmarkObject() {
  volatile int64_t var = 0;
  const int64_t nIterations = 234567890;
  auto start = std::chrono::high_resolution_clock::now();
  for (int64_t i = 0; i < nIterations; i++) {
      class Cleaner {
        volatile int64_t* _pVar;
      public:
        Cleaner(volatile int64_t& var) : _pVar(&var) { }
        ~Cleaner() { (*_pVar)++; }
      } c(var);
  }
  auto elapsed = std::chrono::high_resolution_clock::now() - start;
  double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  printf("Object: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}

void BenchmarkMSVCpp() {
  volatile int64_t var = 0;
  const int64_t nIterations = 234567890;
  auto start = std::chrono::high_resolution_clock::now();
  for (int64_t i = 0; i < nIterations; i++) {
    __try {
    }
    __finally {
      var++;
    }
  }
  auto elapsed = std::chrono::high_resolution_clock::now() - start;
  double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  printf("__finally: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}

template <typename Func> class Finally4 {
  Func f;
public:
  Finally4(Func&& func) : f(std::forward<Func>(func)) {}
  ~Finally4() { f(); }
};

template <typename F> Finally4<F> MakeFinally4(F&& f) {
  return Finally4<F>(std::forward<F>(f));
}

void BenchmarkTemplate() {
  volatile int64_t var = 0;
  const int64_t nIterations = 234567890;
  auto start = std::chrono::high_resolution_clock::now();
  for (int64_t i = 0; i < nIterations; i++) {
    auto doFinally = MakeFinally4([&] { var++; });
    //Finally4 doFinally{ [&] { var++; } };
  }
  auto elapsed = std::chrono::high_resolution_clock::now() - start;
  double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  printf("Template: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}

void BenchmarkEmpty() {
  volatile int64_t var = 0;
  const int64_t nIterations = 234567890;
  auto start = std::chrono::high_resolution_clock::now();
  for (int64_t i = 0; i < nIterations; i++) {
    var++;
  }
  auto elapsed = std::chrono::high_resolution_clock::now() - start;
  double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  printf("Empty: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}

int __cdecl main() {
  BenchmarkFunctor();
  BenchmarkObject();
  BenchmarkMSVCpp();
  BenchmarkTemplate();
  BenchmarkEmpty();
  return 0;
}

The results on my Ryzen 1800X @ 3.9 GHz with DDR4 @ 2.6 GHz CL13 were:

Functor: 175148825.946 Ops/sec, var=234567890
Object: 553446751.181 Ops/sec, var=234567890
__finally: 553832236.221 Ops/sec, var=234567890
Template: 554964345.876 Ops/sec, var=234567890
Empty: 554468478.903 Ops/sec, var=234567890

Apparently, all the options except the functor-based one (#1) are as fast as an empty loop.

So is there a fast and powerful C++ alternative to finally, which is portable and requires minimum copying from the stack of the outer function?

UPDATE: I've benchmarked @Jarod42's solution, so the question now contains the updated code and output. As @Sopel mentioned, though, it may break if copy elision is not performed.

UPDATE2: To clarify, what I'm asking for is a convenient, fast way in C++ to execute a block of code even if an exception is thrown. For the reasons mentioned in the question, some ways are slow or inconvenient.

Serge Rogatch
  • Sure, RAII. Use types that clean up themselves and no matter how the scope is exited the resources are cleaned up. – NathanOliver Jun 13 '17 at 11:55
  • @NathanOliver, RAII is in option #2, function `BenchmarkObject()`. I've listed its disadvantages: mainly that it takes substantial memory on the stack and requires copying from the stack of the outer function. – Serge Rogatch Jun 13 '17 at 11:57
  • This is just pure speculation, but one of the reasons that C++ doesn't have a `finally` clause *could* be that exceptions in C++ are expensive when thrown, and therefore should only be used for truly exceptional cases. That of course leads to `try-catch` blocks being uncommon, and mostly used to do some error reporting and then rethrowing the exception so the application terminates. Which means there's really no use for a `finally` clause. This is unlike other languages where exceptions are the normal error-handling function. – Some programmer dude Jun 13 '17 at 11:58
  • @SergeRogatch RAII-compliant objects don't have to take much memory on the stack (cf. smart pointers), and the copy issue can be resolved via copy elision and move semantics. – nefas Jun 13 '17 at 11:59
  • Wait, I don't get it. If your benchmark showed that "all the operations except functor-base are as fast as an empty loop", then this disproved your assumption that RAII would be slow because of some kind of "copying overhead", which means that there is no question here. Use RAII like everyone else who programs in C++. You don't need to be on the lookout for an "alternative to `finally`". – Cody Gray - on strike Jun 13 '17 at 11:59
  • @SergeRogatch I don't mean make `finally` RAII, but make all the types you use RAII. Let's say you're using `int* foo = new int[some_num]; int* bar = new int[some_num];`: we can replace those with `std::unique_ptr`, and if an exception is raised they will be cleaned up automatically. You don't have to do anything. All it costs is a destructor, which is often minimal if anything. – NathanOliver Jun 13 '17 at 12:01
  • I think Serge uses the finally-functor as RAII option. With RAII it's meant that you don't have to care about resource cleanup at all. So do not use the functor but use self cleaning types. – mattideluxe Jun 13 '17 at 12:01
  • As an addendum to the talk about RAII: with proper use of it and the standard containers, smart pointers etc., one could also follow [the rule of zero](http://en.cppreference.com/w/cpp/language/rule_of_three#Rule_of_zero), which will make life much easier in general and also has the effect that variables with automatic storage duration (a.k.a. local variables) will simply be cleaned up nicely when they go out of scope. Which means you can use blocks and scoping as a way of handling `finally`, as in `{ SomeType a; try { ... } catch (...) { ... } /* automatic "finally" of variable a */ }` – Some programmer dude Jun 13 '17 at 12:06
  • You can find some info in [this video from Andrei Alexandrescu (cppcon 2015)](https://www.youtube.com/watch?v=WjTrfoiB0MQ). It explains how to create callbacks which are called when you go out of scope, when an exception is raised, or when no exception is raised. – nefas Jun 13 '17 at 12:12
  • @CodyGray, the stack and copying overhead of option #2 is not acute in my example because the `Cleaner` does not need a lot of variables from the outer function here. But in practice it needs several: at least the array pointer and the number of objects in the array, so as to call their destructors explicitly before returning the memory as `void*` to a memory pool. And I need multiple cleaners for multiple resources allocated at different stages within a function. – Serge Rogatch Jun 13 '17 at 12:14
  • @Someprogrammerdude: Not every `finally` really handles a resource; some might restore state, and creating an RAII class for each such case would just repeat a pattern which can be factored out by `finally`. – Jarod42 Jun 13 '17 at 12:31
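
The suggestion in the comments above to rely on self-cleaning types instead of a `finally` block could look roughly like the following. This is only a minimal sketch: `sumSquares` and its parameters are hypothetical names made up for illustration, and the "failure" is simulated with a plain `std::runtime_error`.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical function: allocates two buffers and may fail part-way through.
// Whether it returns normally or throws, both buffers are released by the
// destructors of their owning objects; no explicit cleanup code is written.
long long sumSquares(std::size_t n, bool shouldThrow)
{
    std::unique_ptr<int[]> foo(new int[n]);  // released with delete[] automatically
    std::vector<int> bar(n);                 // the vector owns its storage as well

    for (std::size_t i = 0; i < n; ++i) {
        foo[i] = static_cast<int>(i);
        bar[i] = foo[i] * foo[i];
    }

    // Simulated failure: stack unwinding destroys bar and foo on the way out.
    if (shouldThrow) throw std::runtime_error("simulated failure");

    long long total = 0;
    for (int v : bar) total += v;
    return total;
}
```

As Jarod42 notes, this covers resource ownership but not every cleanup action (for example restoring some state), which is where a generic scope guard is still useful.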

2 Answers


You can implement Finally without type erasure and overhead of std::function:

template <typename F>
class Finally {
    F f;
public:
    template <typename Func>
    Finally(Func&& func) : f(std::forward<Func>(func)) {}
    ~Finally() { f(); }

    Finally(const Finally&) = delete;
    Finally(Finally&&) = delete;
    Finally& operator =(const Finally&) = delete;
    Finally& operator =(Finally&&) = delete;
};

template <typename F>
Finally<F> make_finally(F&& f)
{
    return { std::forward<F>(f) };
}

And use it like:

auto&& doFinally = make_finally([&] { var++; });
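
To illustrate that the cleanup runs on both the normal and the exceptional path, here is a small self-contained sketch. The `Finally` class and `make_finally` are repeated from above so the block stands alone; `runWithCleanup` is a hypothetical helper added only for this illustration.

```cpp
#include <stdexcept>
#include <utility>

// Same shape as the Finally class above, repeated so the sketch is self-contained.
template <typename F>
class Finally {
    F f;
public:
    template <typename Func>
    Finally(Func&& func) : f(std::forward<Func>(func)) {}
    ~Finally() { f(); }

    Finally(const Finally&) = delete;
    Finally(Finally&&) = delete;
    Finally& operator=(const Finally&) = delete;
    Finally& operator=(Finally&&) = delete;
};

template <typename F>
Finally<F> make_finally(F&& f)
{
    return { std::forward<F>(f) };  // initializes the return object directly
}

// Hypothetical helper: returns how many times the cleanup lambda ran.
int runWithCleanup(bool shouldThrow)
{
    int cleanups = 0;
    try {
        auto&& doFinally = make_finally([&] { ++cleanups; });
        (void)doFinally;  // silence "unused variable" warnings
        if (shouldThrow) throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        // The guard was already destroyed during stack unwinding.
    }
    return cleanups;
}
```

Calling `runWithCleanup(false)` and `runWithCleanup(true)` should both report exactly one cleanup, since the guard's destructor runs either at the end of the `try` block or during unwinding.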


Jarod42
  • make sure to wrap the destructor in `try/catch` as `f` may throw, or do SFINAE magic to choose the destructor overload based on the `noexcept` specifier – David Haim Jun 13 '17 at 12:22
  • @DavidHaim: In fact it is more complicated, see [finally-scopeexit](https://stackoverflow.com/documentation/c%2b%2b/1320/raii-resource-acquisition-is-initialization/4551/finally-scopeexit#t=201706131223351141309) and its variance [ScopeFailed](https://stackoverflow.com/documentation/c%2b%2b/1320/raii-resource-acquisition-is-initialization/18993/scopefail-c17#t=201706131223351141309) [ScopeSuccess](https://stackoverflow.com/documentation/c%2b%2b/1320/raii-resource-acquisition-is-initialization/18992/scopesuccess-c17#t=201706131223351141309) – Jarod42 Jun 13 '17 at 12:26
  • Or C++17 style `Finally do_finally { [&]{++var;} }`. – MSalters Jun 13 '17 at 13:23
  • Neither `auto doFinally = make_finally([&] { var++; });` nor `Finally do_finally { [&]{++var;} }` compiles in MSVC++2017 for me. Did you mean `F` and `Func` to be the same in class `Finally`? – Serge Rogatch Jun 13 '17 at 16:36
  • Won't this break if copy elision is not performed (pre-C++17)? – Sopel Jun 13 '17 at 16:53
  • @Sopel Yes! In this case, `Finally` should have a boolean flag that allows it to be disabled, and moving from an object should disable it. – Arne Vogel Jun 13 '17 at 17:09
  • @Sopel, yes, this involves copy constructor: adding `Finally(const Finally&) = delete;` breaks compilation. But without this, it may break at run time. – Serge Rogatch Jun 13 '17 at 17:46
  • I implemented a type erasing function wrapper that works with move-only function objects (very similar to `std::function`). The main overhead of `std::function` is probably from memory allocation because it uses the heap for storage. The standard allows for small object optimization though and even seems to mandate it for function pointers and reference wrappers. – Arne Vogel Jun 13 '17 at 17:51
  • @Serge Rogatch, it's not about copying vs. moving though. The implicitly generated move constructor would not be better. E.g. `Finally(Finally&&) = default;` fixes the compile error but not the behavior. – Arne Vogel Jun 13 '17 at 17:53
  • @SergeRogatch: Fix sample to have expected behavior even without guaranteed copy elision from C++17. – Jarod42 Jun 14 '17 at 07:32
  • @Jarod42, this still does not compile for me. Please also check `template <typename F> class Finally` and `template <typename Func> Finally(Func&& func)`. I guessed that you meant to just use `F` for the constructor too. – Serge Rogatch Jun 14 '17 at 12:26
  • Serge Rogatch: You have to use `auto&&` or `const auto &` with `make_finally` (at least pre-17). – Arne Vogel Jun 14 '17 at 13:57
  • I've found that whether it compiles or not depends on whether `return { std::forward<F>(f) };` or `return Finally<F>{ std::forward<F>(f) };` is used. So isn't it a compiler bug? – Serge Rogatch Jun 14 '17 at 18:50
  • There is a subtle difference indeed: the braced `return { ... }` form doesn't make a copy/move but constructs the return object "in place". – Jarod42 Jun 14 '17 at 19:25
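
The "boolean flag" idea raised in the comments above (a guard that can be moved, with the moved-from object disarmed) might be sketched like this. The names `MovableFinally`, `make_movable_finally` and `countCleanupsAfterMove` are hypothetical and not part of the answer; the point is only that the cleanup runs once whether or not the compiler elides the move.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical variant of Finally that stays correct when it is moved.
template <typename F>
class MovableFinally {
    F f;
    bool active = true;  // false once responsibility has been moved elsewhere
public:
    explicit MovableFinally(F func) : f(std::move(func)) {}

    // Moving transfers the duty to run f; the source is disarmed.
    MovableFinally(MovableFinally&& other)
        : f(std::move(other.f)), active(other.active)
    {
        other.active = false;
    }

    ~MovableFinally() { if (active) f(); }

    MovableFinally(const MovableFinally&) = delete;
    MovableFinally& operator=(const MovableFinally&) = delete;
    MovableFinally& operator=(MovableFinally&&) = delete;
};

template <typename F>
MovableFinally<typename std::decay<F>::type> make_movable_finally(F&& f)
{
    return MovableFinally<typename std::decay<F>::type>(std::forward<F>(f));
}

// Hypothetical helper: forces an explicit move and counts the cleanup calls.
int countCleanupsAfterMove()
{
    int cleanups = 0;
    {
        auto first = make_movable_finally([&] { ++cleanups; });
        auto second = std::move(first);  // first is disarmed, second is armed
        (void)second;
    }
    return cleanups;
}
```

The trade-off is one extra `bool` and a branch in the destructor, in exchange for not depending on copy elision or on binding the result to `auto&&`.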

Well, it's your benchmark that's broken: It does not actually throw, so you only see the non-exception path. This is quite bad as the optimizer can prove that you don't throw, so it can throw away all code that actually handles performing cleanup with an exception in flight.

I think, you should repeat your benchmark, putting a call to exceptionThrower() or nonthrowingThrower() into your try{} block. These two functions should be compiled as a separate translation unit, and only linked together with the benchmark code. That will force the compiler to actually generate exception handling code irrespective of whether you call exceptionThrower() or nonthrowingThrower(). (Make sure that you don't switch on link time optimizations, that could spoil the effect.)

This will also allow you to easily compare the performance impacts between the exception and the non-throwing execution paths.
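
A rough single-file approximation of such a benchmark is sketched below. It rests on an assumption: instead of a separate translation unit, the callee is invoked through a `volatile` function pointer so that the optimizer cannot tell which function is called. `PathResult` and `measurePath` are hypothetical names; for real measurements the two callees should be moved into their own source file as described above.

```cpp
#include <chrono>
#include <cstdint>
#include <stdexcept>

// In the setup described above these would live in a separate translation
// unit; they are kept here only so the sketch fits in one file.
void exceptionThrower() { throw std::runtime_error("benchmark"); }
void nonthrowingThrower() {}

struct PathResult {
    std::int64_t cleanups;  // how many times the cleanup ran
    double seconds;         // wall-clock time for the whole loop
};

// Hypothetical harness: runs the try/cleanup pattern `iterations` times.
PathResult measurePath(void (*callee)(), std::int64_t iterations)
{
    // Assumption: reading the pointer through volatile hides the call target
    // from the optimizer, standing in for separate compilation.
    void (*volatile opaque)() = callee;
    volatile std::int64_t var = 0;

    struct Cleaner {
        volatile std::int64_t* p;
        ~Cleaner() { *p = *p + 1; }  // the "finally" body
    };

    auto start = std::chrono::steady_clock::now();
    for (std::int64_t i = 0; i < iterations; ++i) {
        try {
            Cleaner c{&var};
            opaque();  // may or may not throw; the compiler cannot know
        } catch (const std::runtime_error&) {
            // Swallow the exception so the loop continues.
        }
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return PathResult{var, elapsed.count()};
}
```

Comparing `measurePath(nonthrowingThrower, n)` with `measurePath(exceptionThrower, n)` for the same `n` gives the two execution paths side by side; a much smaller `n` is advisable for the throwing case.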


Apart from the benchmark issues, exceptions in C++ are slow. You'll never get hundreds of millions of exceptions thrown within a second. It's more around single digit millions at best, likely less. I expect that any performance differences between different finally implementations are entirely irrelevant in the throwing case. What you can optimize is the non-throwing path, where your cost is simply the construction/destruction of your finally implementation object, whatever that is.

cmaster - reinstate monica
  • How well the optimizer handles such a `finally` is very important: if it decides not to copy variables and to inline the releasing function, that's very good for performance. And of course the non-exceptional scenario is much more important for performance than the exceptional one. My benchmark may not be that good, but it's also not totally bad: because the variable is `volatile`, the compiler can't throw away its increment, which I do in the releasing function. – Serge Rogatch Jun 14 '17 at 12:06
  • @SergeRogatch Ah, I didn't see that `volatile`. That does indeed take care of the issue with the finally body quite nicely. However, the issue with exception generation remains: There is a difference between compiling code that is known not to throw, and compiling code that is not known not to throw. I'll now edit my answer to reflect the `volatile` correctly. – cmaster - reinstate monica Jun 14 '17 at 12:31
  • @SergeRogatch I've now updated the answer. – cmaster - reinstate monica Jun 14 '17 at 12:38