12

Suppose I want to write something a function such that:

  • it returns an object
  • under certain circumstances, depending on a parameter of the function, the object has a fixed value that can be calculated only once to save time. So the natural choice is to make that object static.
  • otherwise, the function must generate the object on the fly

What is the best to write this function, with the requirements that:

  • no copy constructors are required upon call, only move, to prevent copying all the data of an expensive object
  • I don't want to use new raw pointers. If pointers are required, they must be smart and auto delete

The use case is the following:

  • certain values are very common, so I want to cache them
  • other values are very rare, so I want them not to be cached

The caller should not know which values are common, the interface should be transparent for both cases.

So far, the only clean implementation that I've managed is to use shared_ptr as shown below, but it feels like overkill. In particular, because it makes a heap allocation, where it feels that one is not really required. Is there a better approach?

#include <cassert>
#include <iostream>
#include <memory>

struct C {
    int i;
    static int count;
    C(int i) : i(i) {
        std::cout << "constr" << std::endl;
        count++;
    }
    C(const C& c) : C(c.i) {
        std::cout << "copy" << std::endl;
    }
    ~C() {
        std::cout << "destr" << std::endl;
        count--;
    }
};
int C::count = 0;

std::shared_ptr<C> func_reg_maybe_static(int i) {
    static auto static_obj = std::make_shared<C>(0);
    if (i == 0) {
        return static_obj;
    } else {
        return std::make_shared<C>(i);
    }
}

int main() {
    assert(C::count == 0);

    {
        auto c(func_reg_maybe_static(0));
        assert(c->i == 0);
        assert(C::count == 1);
    }
    assert(C::count == 1);

    {
        auto c(func_reg_maybe_static(0));
        assert(c->i == 0);
        assert(C::count == 1);
    }
    assert(C::count == 1);

    {
        auto c(func_reg_maybe_static(1));
        assert(c->i == 1);
        assert(C::count == 2);
    }
    assert(C::count == 1);

    {
        auto c(func_reg_maybe_static(2));
        assert(c->i == 2);
        assert(C::count == 2);
    }
    assert(C::count == 1);
}

Which I compile with GCC 6.4.0:

g++ -std=c++17 -Wall -Wextra -pedantic-errors -o main.out func_ret_maybe_static.cpp

and it produces the expected output (guaranteed by copy elision I believe):

constr
constr
destr
constr
destr
destr

I've added the static reference counter C::count just to check that objects are actually getting deleted as expected.

If I didn't have the static case, I would just do directly:

C func_reg_maybe_static(int i) {
    return C(i);
}

and copy elision / move semantics would make everything efficient.

However, if I try something analogous as in:

C func_reg_maybe_static(int i) {
    static C c(0);
    return c;
}

then C++ smartly stops just moving C, and starts to copy it, to avoid corrupting the static.

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
  • 1
    Not quite clear what else you want - if you return object by value you either have to move or copy it. You can move automatic but must copy static object obviously. Or use smart pointer. – Slava Nov 14 '18 at 16:34
  • @Slava any better way than `shared_ptr` as I've done, or is this a valid use of `shared_ptr`? – Ciro Santilli OurBigBook.com Nov 14 '18 at 16:36
  • 1
    Any use of `std::shared_ptr` is valid unless you abuse it. I think in this case it is most straightforward and easy to understand solution. – Slava Nov 14 '18 at 16:42
  • @Slava You can return a reference to a static-local, no need for `shared_ptr` – Lightness Races in Orbit Nov 14 '18 at 16:46
  • 1
    @LightnessRacesinOrbit you can return a reference to a static, but you cannot return a reference to an automatic, and it seems you need both a static and an automatic here. – n. m. could be an AI Nov 14 '18 at 17:20
  • @n.m. Indeed, turns out the OP requires a single function (this information was only recently provided) – Lightness Races in Orbit Nov 14 '18 at 17:20
  • Statements: `the object has a fixed value that can be calculated only once to save time` and `no copy constructors are required upon call, only move, to prevent copying all the data of an expensive object` seem to counter each other. – Sahil Dhoked Nov 14 '18 at 18:48
  • If you do not change value of C after the return, you can return a reference to C, which does not call any copy constructor – Sahil Dhoked Nov 14 '18 at 18:49

3 Answers3

6

I don't really understand the purpose of the static/automatic stuff. If all you're trying to do is avoid repeated constructions when the argument has been used before, why not just have a cache instead?

C& func_reg(const int i)
{
    static std::unordered_map<int, C> cache;
    auto it = cache.find(i);
    if (it == cache.end())
      it = cache.emplace(i, C(i));

    return it->second;
}

Don't always want the cache? Fine! Add this:

C func_reg_nocache(const int i)
{
    return C(i);
}

If I've misunderstood and you really need this thing where passing is_static == true gives you an entirely different object (one constructed with 0 rather than i) then just make a new function for that; it's doing something distinct.

C& func_reg()
{
    static C obj(0);
    return obj;
}

C func_reg(const int i)
{
    return C(i);
}
Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
  • 1
    There's nothing left with automatic storage then, right? – YSC Nov 14 '18 at 16:35
  • @YSC: Nope but I'm wilfully dropping that requirement because I don't really understand its value. Speaking functionally, that is. OP does allow for dynamic allocation as long as things are cleaned up automatically. – Lightness Races in Orbit Nov 14 '18 at 16:35
  • From OP's last comment, I fully agree. +1. – YSC Nov 14 '18 at 16:36
  • I'm tempted to make a new requirement saying that I don't want to cache certain values. I'm sure there are cases where this is a reasonable requirement? E.g. all cases are very rare except a few very common ones. – Ciro Santilli OurBigBook.com Nov 14 '18 at 16:38
  • 1
    @CiroSantilli新疆改造中心六四事件法轮功 Maybe but now we're really in the realm of me not understanding what problem you're trying to solve, sorry. – Lightness Races in Orbit Nov 14 '18 at 16:39
  • @CiroSantilli新疆改造中心六四事件法轮功 You can add a flag "const bool useCache = true" trivially and set it to false to skip that – Lightness Races in Orbit Nov 14 '18 at 16:45
  • 1
    You made me realize that I had not mapped my problem well to the question. I have edited the question to reflect that, hopefully this is the right approach in this case rather than asking a new one. – Ciro Santilli OurBigBook.com Nov 14 '18 at 17:09
  • @CiroSantilli新疆改造中心六四事件法轮功 Thanks, that's a vast improvement to the question. The added information rules out my solution, unfortunately. – Lightness Races in Orbit Nov 14 '18 at 17:11
4

The returned object may refer to a preexisting object depending on an opaque criteria. This implies that the function needs to return by reference. But since the storage for an uncached value must outlive the function and not be on the heap, the caller needs to preemptively provide the storage on the stack for the returned object.

C& func_reg_maybe_static(int i, void* buf)
{
    static C c(0);
    if(i == 0)
        return c;
    else
        return *new (buf) C(i);
}

using uninit_C = std::aligned_storage<sizeof(C), alignof(C)>;
uninit_C buf;
auto& c = func_reg_maybe_static(i, &buf);

This requires a manual destructor call further down the line if and only if the destructor has side effects. Another take would be

C& func_reg_maybe_static(int i, C& buf)
{
    static C c(0);
    if(i == 0)
        return c;
    else
    {
        buf.~C();
        return *new (&buf) C(i); // note below
    }
}

C buf;
auto& c = func_reg_maybe_static(i, buf);

This version obviously requires C be default constructible. The placement new is perfectly legal, but using buf after the function call may or may not be legal depending on the contents of C. The destructor will be called on scope exit.

Passer By
  • 19,325
  • 6
  • 49
  • 96
  • Why do we need `return *new (buf) C(i);`, and not just `return C(i)` in the else statement? – Sahil Dhoked Nov 14 '18 at 18:55
  • @SahilDhoked We return by reference. – Passer By Nov 14 '18 at 18:56
  • Cool, I didn't know about placement new. Can you clarify "The placement new is perfectly legal, but using buf after the function call may or may not be legal depending on the contents of C.": when is it now legal? – Ciro Santilli OurBigBook.com Nov 15 '18 at 04:59
  • @CiroSantilli新疆改造中心六四事件法轮功 Link added. – Passer By Nov 15 '18 at 07:12
  • 1
    @CiroSantilli新疆改造中心六四事件法轮功 And I'm sorry about the billion edits, I wasn't all that awake when I wrote this. – Passer By Nov 15 '18 at 07:21
  • I may be old fashioned or weird, but I honestly think it'd be better to pass `buf` by pointer. If only to make it truly explicit on the caller site that this object may be manhandled by the function. – StoryTeller - Unslander Monica Nov 15 '18 at 07:24
  • @StoryTeller I personally think its fine either way, as long as it's consistent. I'm more worried about the semantic weirdness for the second version. I was considering deleting the post but then I saw people like this apparently. – Passer By Nov 15 '18 at 07:27
  • I get what you mean by the semantic weirdness of the second version. I guess that's what makes me more partial towards pointers too – StoryTeller - Unslander Monica Nov 15 '18 at 07:29
3

What you want is essentially a variant:

std::variant<C, C*> func_reg_maybe_static(int i)
{
  static C static_obj{ 0 };
  if (i == 0) {
    return &static_obj;
  } else {
    return std::variant<C, C*>{ std::inplace_type_t<C>{}, i };
  }
}

This has the advantage of not requiring any additional memory allocations (compared to the shared_ptr approach). But it is somewhat inconvenient to use on the caller side. We could work around this by writing a wrapper class:

template <class C> class value_or_ptr
{
  std::variant<C, C*> object_;
public:
  explicit template <class... Params> value_or_ptr(Params&&... parameters)
    : object_(std::inplace_type_t<C>{}, std::forward<Params>(parameters)...)
  {}
  explicit value_or_ptr(C* object)
    : object_(object)
  {}
  // other constructors...

  C& operator*()
  {
    return object_.index() == 0 ? std::get<0>(object_) : *std::get<1>(object_);
  }

  // other accessors...
}

If you don't have a C++17 compiler, the same can be done with unions, but the implementation will of course be more complex. The original function then becomes:

value_or_ptr<C> func_reg_maybe_static(int i)
{
  static C static_obj{ 0 };
  if (i == 0) {
    return value_or_ptr{ &static_obj };
  } else {
    return value_or_ptr{ i };
  }
}

Contrast with Overloads

We could achieve similar results with function overloads:

C& func_reg()
{
  static C obj{ 0 };
  return obj;
}

C func_reg(const int i)
{
  return C{ i };
}

This has the advantage of being simpler but might require the caller to make a copy. It also requires the caller to know if a pre-computed object will be returned (which is undesirable). With the variant approach, the caller can treat the results uniformly. The caller always gets back an object with value semantics and does not need to know if he will receive a pre-computed object.

Peter Ruderman
  • 12,241
  • 1
  • 36
  • 58