Representing a value which can't be ctor-initialization-list initialized

Question

I'm writing a class C with a member foo of type foo_t. This member must be defined and valid throughout the lifetime of a C instance; however, I don't have the necessary information to construct it at compile time, i.e. I can't use

class C {
    foo_t foo { arg };
};

Nor can I construct it when the ctor of C gets invoked, i.e. I can't have

class C {
    foo_t foo;
    C (whatever) : foo(compute_arg(whatever)) { }
};

rather, I am only able to construct it after some code within the C ctor has run. Edit: The reason for this may be that I need to run some code with side-effects (e.g. disk or network I/O) to obtain the construction argument; and I need that code to run also to be able to initialize other members, so I can't just call it several times as a free function from the initialization list.

So, how should I represent foo?

- If foo_t can be default-constructed with some dummy/invalid/empty/null value, then I can let that happen, safe in the knowledge that it will never be accessed in that dummy state. Detriment: The declaration of foo in C does not indicate that it's always valid.
- If foo_t only has a valid state, i.e. I can't construct it at all until I have the relevant information, then:
  - I can use std::unique_ptr<foo_t>; initially it will be nullptr, then be assigned to. Detriments: No indication that it'll never be null after C() concludes; a useless allocation.
  - I can use std::optional<foo_t>; initially it will be nullopt, then be assigned to. Detriments: No indication that it'll never be empty after C() concludes; requires C++17; the word "optional" suggests that it's "optional" to have foo, while it isn't.
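For reference, the std::optional variant could look roughly like the sketch below. It assumes C++17; foo_t, load_config and the members of C are hypothetical stand-ins made up for illustration, not code from the actual project. The accessor is one way to keep the "never empty after C() concludes" invariant in a single place.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the real foo_t: it has no default constructor.
struct foo_t {
    explicit foo_t(std::string s) : name(std::move(s)) {}
    std::string name;
};

// Hypothetical stand-in for the side-effecting code (disk or network I/O).
inline std::string load_config() { return "from-disk"; }

class C {
public:
    C() {
        std::string arg = load_config(); // runs once, inside the ctor body
        foo_.emplace(arg);               // foo_ is engaged from here on
        other_ = arg.size();             // the same result feeds another member
    }

    // Callers only ever see a valid foo_t; the optional stays an
    // implementation detail of C.
    const foo_t& foo() const {
        assert(foo_.has_value());
        return *foo_;
    }
    std::size_t other() const { return other_; }

private:
    std::optional<foo_t> foo_; // empty only while the constructor is running
    std::size_t other_ = 0;
};
```

This does not fix the naming complaint, but it keeps the possibly-empty state out of the public interface.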

I'm more interested in the second case, since in the first case the ambiguity about a foo_t's validity is kind of built-in. Is there a better alternative to the two I mentioned?

Note: We can't alter foo_t.

@VittorioRomeo: one of possibilities is that `foo` constructor arguments are result of some computation, and results of the same computation can be required to initialise other members — Andriy Tylychko, Feb 16 '18 at 13:20
Let's suppose you're in the second case and that you chose the workaround "let's roll with std::unique_ptr". In your constructor's body, there'll be a moment where you write "new foo_t([arguments])", where [arguments] have been computed from the constructor's arguments. Create a free function that maps those constructor arguments to a tuple containing [arguments], then replace your constructor with one that calls another constructor with the exact same arguments + [arguments], initializes foo_t with them, and resumes whatever computation you were doing with [arguments] in the constructor's body? – Caninonos, Feb 16 '18 at 13:20
This _"member must be defined and valid throughout the lifetime of a C instance"_ and this _"Nor can I construct it when the ctor of C gets invoked"_ are mutually exclusive. — Eljay, Feb 16 '18 at 13:21
With the information you've provided there is not a better option than `optional`. But as you can see from all the downvotes, the hand-wavey statement: "only able to construct it within the ctor" is insufficient for anyone to make intelligent commentary on the question. (A singleton might be an option but how would I know since I don't know what prevents construction?) In any case, you put effort into this question I can tell, this is a natural oversight. Should you be willing to update the reason you can't construct there'd be an upvote in it for you. – Jonathan Mee, Feb 16 '18 at 13:24
There is almost certainly at least one way to rearrange things so that a constructor mem-initializer does correctly initialize `foo`. But this would be easier to explain if you gave more detail about the code inside the constructor that makes it difficult. — aschepler, Feb 16 '18 at 13:26
another not mentioned option would be to make `C` constructor private that takes more detailed info sufficient to create `foo` and whatever else, and to have a static `C Create(whatever_t)` method that does the required computation and passes the result to `C` constructor — Andriy Tylychko, Feb 16 '18 at 13:28
Here's something to illustrate what I said in my earlier comment, and probably what Andriy Tylychko said in his last comment: https://ideone.com/ubbbb7 . That kind of transformation seems always possible as long as your only way to obtain instances of foo_t isn't some kind of "mkfoo" function returning a `foo_t*` (with a non-copyable, non-movable foo_t). You may even avoid the tuple by introducing constructors declaring the local variables as arguments, and delegate to another constructor by passing them by reference + evaluating a function with mutates them as part of argument evaluation. — Caninonos, Feb 16 '18 at 13:42
@VittorioRomeo: I explained why in the sentence following the one you quoted. As a concrete example - perhaps I need to read something from a file in order to construct it. — einpoklum, Feb 16 '18 at 13:43
@einpoklum: that doesn't explain why - you can encapsulate the "file reading logic" in its own function and initialize `foo` in the initializer list by calling it — Vittorio Romeo, Feb 16 '18 at 13:44
@VittorioRomeo: Hmm. But what if the code which is required for the initialization of the `foo_t` is also used to initialize other values? And has some side-effects which make it undesirable to run it multiple times? i.e. what AndriyTylychko said. — einpoklum, Feb 16 '18 at 13:48
@Eljay: I mean, I need to run some code in the constructor body before I can initialize `foo`. After the ctor is done, the `foo` needs to be valid. — einpoklum, Feb 16 '18 at 13:52
@einpoklum: can you show the code that you need to run? I'm confident there's a way of refactoring it so that it can be used to initialize `foo` in the initialization list — Vittorio Romeo, Feb 16 '18 at 13:54
@VittorioRomeo: No, because in my concrete motivating case, and with your suggestion, I think I will be able to refactor some of the ctor code into a freestanding function and separate out the disk I/O side effects. So it's become a theoretical question. But you could make your comments into an answer, which while not covering any and all cases, has still been useful to me. — einpoklum, Feb 16 '18 at 13:57
@Caninonos: I think the suggestion you've linked to is pretty good. Can you make it an answer? — einpoklum, Feb 16 '18 at 14:11

Andriy Tylychko · Answer 1 · 2018-02-16T14:12:27.330


Let's consider a bit more specific case

struct C {
    foo_t foo1;
    foo_t foo2;
    C () :
        foo1(read_from_file()), // first read
        foo2(read_from_file())  // second read of the same data
    { }

    static whatever_t read_from_file();
};

and let's assume it's not desired to read same data from the file twice.

One possible approach can be:

struct C {
    foo_t foo1;
    foo_t foo2;

    C(): C{Create()} {}

private:
    static C Create()
    {
        return C{read_from_file()};
    }

    C(whatever_t whatever):
        foo1{whatever},
        foo2{whatever}
    {}

    static whatever_t read_from_file();
};

Thanks to @VittorioRomeo for suggestions to improve it.
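A self-contained variation on the same idea, as a rough sketch: here the public constructor delegates straight to the private one, so C does not need to be copyable or movable. foo_t, whatever_t and the reads counter are hypothetical stand-ins (the counter exists only to make the single read observable).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins: foo_t and whatever_t are not specified above.
using whatever_t = std::string;

struct foo_t {
    explicit foo_t(const whatever_t& w) : value(w) {}
    std::string value;
};

struct C {
    foo_t foo1;
    foo_t foo2;

    // Public constructor: read_from_file() is evaluated once, and its
    // result is bound to the private constructor's parameter.
    C() : C(read_from_file()) {}

    static int reads; // number of read_from_file() calls so far

private:
    // Both members are initialized from the same, already-read data.
    explicit C(const whatever_t& whatever)
        : foo1{whatever},
          foo2{whatever}
    {}

    static whatever_t read_from_file() {
        ++reads; // stands in for the real disk I/O
        return "file contents";
    }
};

int C::reads = 0;
```

Constructing one C should bump C::reads by exactly one, while foo1 and foo2 end up holding the same data.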

You should `return C{read_from_file()};` so that `read_from_file()` is passed to `C` as an rvalue. Also, just expose a public constructor that calls `Create` for you and make `Create` private – Vittorio Romeo Feb 16 '18 at 14:07
E.g. `C() : C(Create()) { }` – Vittorio Romeo Feb 16 '18 at 14:08

Caninonos · Answer 2 · 2018-02-16T16:44:21.163

In general if you can construct a foo_t in the constructor bodies of some class (without member initializer lists), then, you can modify your code so that your class now has a foo_t attribute and its constructors either delegate construction or construct it inside their member initializer lists.

Basically, in most cases, you can rewrite your problematic constructor so that it delegates to another constructor while providing it with necessary information to construct a foo_t instance in the member initializer list (which I quickly and informally illustrated in the comments with the following "example" https://ideone.com/ubbbb7 )
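A minimal sketch of that tuple-based delegation, under made-up assumptions: prepare, foo_t and the members of C are hypothetical names chosen for illustration, and the real computation would of course be whatever your constructor body currently does.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical foo_t: only constructible from its argument.
struct foo_t {
    explicit foo_t(int n) : n(n) {}
    int n;
};

// Hypothetical free function: does the (possibly side-effecting) work once
// and returns everything the member initializers will need.
inline std::tuple<int, std::string> prepare(int seed) {
    return std::make_tuple(seed * 2, "label-" + std::to_string(seed));
}

class C {
public:
    // The problematic constructor now only delegates.
    explicit C(int seed) : C(prepare(seed)) {}

    const foo_t& foo() const { return foo_; }
    const std::string& label() const { return label_; }

private:
    // Target constructor: unpacks the tuple in the member initializer list.
    explicit C(std::tuple<int, std::string>&& prepared)
        : foo_(std::get<0>(prepared)),
          label_(std::get<1>(std::move(prepared)))
    {
        assert(!label_.empty()); // whatever was left of the old body goes here
    }

    foo_t foo_;
    std::string label_;
};
```

With this, C(21) would call prepare once and end up with foo().n == 42 and label() == "label-21".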

More generally, and if the tuple construction would happen to be a problem for some reason, the following transformation will (in general) work. It's admittedly a bit long (and ugly), but bear in mind it's for generality's sake and that one could probably simplify things in practice.

Let's assume we have a constructor where we construct a foo_t. For the sake of simplicity, we'll further assume it to be of the following form:

C::C(T1 arg_1, T2 arg_2) {
    side_effects(arg_1, arg_2);
    TL1 local(arg_1, arg_2);
    second_side_effects(arg_1, arg_2, local);
    foo_t f(arg_1, arg_2, local); // the actual construction
    final_side_effects(arg_1, arg_2, local, f);
}

Here the function calls possibly mutate the arguments. We can delegate once to eliminate the declaration of local in the constructor body, then once again to get rid of the call to second_side_effects(arg_1, arg_2, local).

C::C(T1 arg_1, T2 arg_2)
: C::C(arg_1, arg_2
      // comma operator: run the side effects first, then build the temporary
      ,([](T1& a, T2& b){
          side_effects(a, b);
        }(arg_1, arg_2), TL1(arg_1, arg_2))) {}

C::C(T1& arg_1, T2& arg_2, TL1&& local)
: C::C(arg_1, arg_2
      ,[](T1& a, T2& b, TL1& c) -> TL1& {
          second_side_effects(a, b, c);
          return c;
      }(arg_1, arg_2, local)) {}

C::C(T1& arg_1, T2& arg_2, TL1& local) {
    foo_t f(arg_1, arg_2, local); // the actual construction
    final_side_effects(arg_1, arg_2, local, f);
}


Clearly, f could be made an actual member of C and be constructed in the member initialization list of that last constructor.
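Putting it together with f as a real member, the chain might look like the sketch below. All types and functions are hypothetical stand-ins (int for T1 and T2, local_t for TL1), the lambdas are replaced by plain comma expressions for brevity, and trace() exists only to record the order in which things run.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events; exists only for this illustration.
inline std::vector<std::string>& trace() {
    static std::vector<std::string> t;
    return t;
}

struct local_t { // stand-in for TL1
    local_t(int a, int b) : sum(a + b) { trace().push_back("local"); }
    int sum;
};

struct foo_t { // stand-in for the real foo_t
    foo_t(int a, int b, const local_t& l) : value(a * b + l.sum) {
        trace().push_back("foo");
    }
    int value;
};

inline void side_effects(int& a, int& b) {
    ++a; ++b;
    trace().push_back("first");
}
inline void second_side_effects(int&, int&, local_t& l) {
    l.sum *= 2;
    trace().push_back("second");
}

class C {
public:
    // Step 1: run the first side effects, then build the local as a temporary.
    C(int a, int b)
        : C(a, b, (side_effects(a, b), local_t(a, b))) {}

    int value() const { return f.value; }

private:
    // Step 2: run the second side effects; the named rvalue reference is an
    // lvalue here, so the next delegation picks the local_t& overload.
    C(int& a, int& b, local_t&& local)
        : C(a, b, (second_side_effects(a, b, local), local)) {}

    // Step 3: f is finally constructed in the member initializer list.
    C(int& a, int& b, local_t& local)
        : f(a, b, local) { trace().push_back("body"); }

    foo_t f;
};
```

The temporary local_t lives until the outermost delegating constructor's initializer finishes, so the references passed down the chain stay valid throughout.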

One could generalize for any number of local variables (and arguments). I however assumed that our initial constructor didn't have any member initializer list. If it had one, we may have needed to either:

- copy some of the initial arg_i's before they were mutated and pass the copies along the constructor chain so that they could ultimately be used to construct the other members in the member initializer list
- preconstruct instances of the members and pass them along the constructor chain so that they could ultimately be used to move-construct the actual members in the member initializer list

The latter must be chosen if for some reason, the constructor of a member would have side effects.
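The second option (preconstructing a member and moving it along the chain) could be sketched as follows. logger_t, foo_t and the members of C are hypothetical; the static counter only serves to show that the side-effecting constructor runs once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical member type whose constructor has a side effect.
struct logger_t {
    explicit logger_t(std::string n) : name(std::move(n)) { ++constructed; }
    logger_t(logger_t&&) = default; // moving does not repeat the side effect
    std::string name;
    static int constructed;
};
int logger_t::constructed = 0;

// Hypothetical foo_t that depends on the already-built member.
struct foo_t {
    explicit foo_t(std::size_t n) : n(n) {}
    std::size_t n;
};

class C {
public:
    // Preconstruct the member once, then hand it down the chain.
    explicit C(std::string name) : C(logger_t(std::move(name))) {}

    const foo_t& foo() const { return foo_; }
    const logger_t& log() const { return log_; }

private:
    // The real member is move-constructed from the preconstructed instance.
    explicit C(logger_t&& pre)
        : log_(std::move(pre)),
          foo_(log_.name.size()) // log_ is declared first, so it is ready here
    {
        assert(!log_.name.empty()); // illustration only: expects a non-empty name
    }

    logger_t log_;
    foo_t foo_;
};
```

This only works when the member type is movable, which is exactly the limitation the next scenario runs into.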

There is however a case where this all falls apart. Let's consider the following scenario:

#include <memory>

struct state_t; // non copyable, non movable

// irreversible function that mutates an instance of state_t
state_t& next_state(state_t&);

struct foo_t {
    foo_t() = delete;
    foo_t(const foo_t&) = delete;
    foo_t(const state_t&);
};

// definitions are elsewhere

class C {
public:
    struct x_first_tag {};
    struct y_first_tag {};

    // this constructor prevents us from reordering x and y
    C(state_t& s, x_first_tag = {})
    : x(new foo_t(s))
    , y(next_state(s)) {}

    // if x and y were both constructed in the member initializer list
    // x would be constructed before y
    // but the construction of y requires the original s which will
    // be definitively lost when we're ready to construct x !
    C(state_t& s, y_first_tag = {})
    : x(nullptr)
    , y(s) {
        next_state(s);
        x.reset(new foo_t(s));
    }

private:
    std::unique_ptr<foo_t> x; // can we make that a foo_t ?
    foo_t y;
};

In that situation, I admittedly have no idea how to rewrite this class, but I deem it rare enough to not really matter.

I suggest simplifying this by just having one or two arguments for everything. It's just an illustration after all. You see, the introduction to your answer + the example from Andriy's answer work very well together. — einpoklum, Feb 16 '18 at 15:35
