13

Let me begin by saying that I know it is bad practice to call virtual functions from within a constructor or destructor. However, the behavior of doing so, although it might be confusing or not what the user expects, is still well defined.

#include <iostream>

struct Base
{
    Base()
    {
        Foo();
    }
    virtual ~Base() = default;
    virtual void Foo() const
    {
        std::cout << "Base" << std::endl;
    }
};

struct Derived : public Base
{
    virtual void Foo() const
    {
        std::cout << "Derived" << std::endl;
    }
};

int main(int argc, char** argv) 
{
    Base base;
    Derived derived;
    return 0;
}

Output:
Base
Base

Now, back to my real question. What happens if a user calls a virtual function from within the constructor, but from a different thread? Is there a race condition? Is it undefined? Or, to put it another way: is the compiler's setting of the vtable thread-safe?

Example:

#include <future>
#include <iostream>

struct Base
{
    Base() :
        future_(std::async(std::launch::async, [this] { Foo(); }))
    {
    }
    virtual ~Base() = default;

    virtual void Foo() const
    {
        std::cout << "Base" << std::endl;
    }

    std::future<void> future_;
};

struct Derived : public Base
{
    virtual void Foo() const
    {
        std::cout << "Derived" << std::endl;
    }
};

int main(int argc, char** argv) 
{
    Base base;
    Derived derived;
    return 0;
}

Output:
?
Language Lawyer
Gils
  • 2
    The vtable itself is usually a static structure, but the vtable pointer needs to be set within the object. But of course these are implementation details that can't be relied on. – Mark Ransom Jun 09 '20 at 17:17
  • 6
    I see no reason to assume that this would be thread safe. – eerorika Jun 09 '20 at 17:17
  • 1
    Since the behavior of the async function changes when the constructor ends, the behavior depends on the timing of that function vs the timing of the constructor with no synchronization, which means there must be a race condition. A rule of the language is definitely broken here, but I'm not sure which one. It can't be as simple as a race on the vtable pointer, because the language does not recognize that pointer. It is an implementation detail. There must be a higher language concept or requirement that is violated instead. – François Andrieux Jun 09 '20 at 18:02
  • @FrançoisAndrieux There are two options: 1. There is a race condition that results in undefined behavior. 2. The standard guarantees that there will be no race condition while setting the vtable. I wonder which one is true. – Gils Jun 09 '20 at 18:06
  • 1
    @Gils First, there could be any number of other reasons causing UB that don't qualify as race condition. Second, the concept of vtable and vtable pointer are implementation details and are not mentioned or required by the language standard. So it is impossible for it to explicitly guarantee anything related to vtables. The guarantees about vtables have to be inferred from rules about polymorphism and other rules. – François Andrieux Jun 09 '20 at 18:10
  • @FrançoisAndrieux Sure. I totally agree with you. But I still wonder if my example results in UB or not and why. Is it legal or otherwise, like you said, which rule I broke? – Gils Jun 09 '20 at 18:14
  • @eerorika I believe that in order to create an undefined behavior, you need to break a language rule. Not the other way. So if the standard does not say anything about vtables, which rule did I break? In that case, I believe that the "compiler implementor" must make the vtable creation thread-safe. Otherwise, accessing the vtable from multiple threads (and at least one which writes) results in undefined behavior. – Gils Jun 09 '20 at 18:35
  • 2
    @Gils Here's your rule : `The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.` Unless you can find a rule that says that construction of a class object is atomic, then you violate this rule. – eerorika Jun 09 '20 at 18:37
  • @eerorika But vtable is not part of the standard and I'm not modifying it explicitly. It is not part of my code, it is part of the code the compiler generates in order to implement polymorphism. I did not break that rule... the compiler did. – Gils Jun 09 '20 at 18:51
  • I've added relevant tags that may attract the attention of more knowledgeable users. – François Andrieux Jun 09 '20 at 19:22
  • @FrançoisAndrieux Thanks – Gils Jun 09 '20 at 19:27
  • The question and many comments are assuming that the vtable ptr changes between the two constructors. I wanted to point out, there's no reason to assume that. It's possible that the vtable ptr is set only once, and that dispatches from within constructors are merely static rather than dynamic. – Mooing Duck Jun 09 '20 at 19:41
  • 1
    Second, this question probably revolves more around "reading from an object whose lifetime is not yet started", which is absolutely 100% undefined behavior. – Mooing Duck Jun 09 '20 at 19:42
  • @MooingDuck Can you please point to a point in the code that breaks a rule and which rule is it? Maybe, this code is just fine... maybe there is nothing wrong with it... that's what we're trying to figure out. – Gils Jun 09 '20 at 19:46
  • 1
    @Gils: The object is a `Derived`, whose constructor may not yet have completed when `[this] { Foo(); }` starts. That's definitely undefined behavior. The fact that the `Base` constructor is complete is irrelevant. – Mooing Duck Jun 09 '20 at 19:46
  • @MooingDuck the function is called from Base's constructor (not from Derived), on Base's pointer (this) after Base was successfully constructed. So is it Base's fault? Just a reminder that base is a stand-alone class. It can live just fine without Derived. For that matter, Derived might be added (implemented) much later after Base was. – Gils Jun 09 '20 at 19:51
  • After reading https://eel.is/c++draft/basic.life#def:lifetime, https://eel.is/c++draft/class.base.init, and https://eel.is/c++draft/class.cdtor several times, I think this is technically allowed. IMO it shouldn't be, since `Base` has `virtual` methods and was therefore _designed_ to be overridden. – Mooing Duck Jun 09 '20 at 20:33
  • 1
    @eerorika _Unless you can find a rule that says that construction of a class object is atomic, then you violate this rule_ Unless you can find a rule that says that construction of a class object here has conflicting actions, [intro.races] doesn't apply. – Language Lawyer Jun 10 '20 at 00:21
  • Ultimately, my advice to the poster is to step back and re-evaluate what they are trying to do - regardless if this is or isn't practically thread safe. Look at the factory design pattern, I am sure that will substantially simplify this problem. – Jeremy Jul 03 '20 at 19:19

4 Answers

4

First off a few excerpts from the standard that are relevant in this context:

[defns.dynamic.type]

type of the most derived object to which the glvalue refers [Example: If a pointer p whose static type is "pointer to class B" is pointing to an object of class D, derived from B, the dynamic type of the expression *p is "D". References are treated similarly. — end example]

[intro.object] 6.7.2.1

[..] An object has a type. Some objects are polymorphic; the implementation generates information associated with each such object that makes it possible to determine that object's type during program execution.

[class.cdtor] 11.10.4.4

Member functions, including virtual functions, can be called during construction or destruction. When a virtual function is called directly or indirectly from a constructor or from a destructor, including during the construction or destruction of the class's non-static data members, and the object to which the call applies is the object (call it x ) under construction or destruction, the function called is the final overrider in the constructor's or destructor's class and not one overriding it in a more-derived class. [..]

As you wrote, it is clearly defined how virtual function calls in the constructor/destructor work - they depend on the dynamic type of the object, and the dynamic type information associated with the object, and that information changes in the course of the execution. It is not relevant what kind of pointer you are using to "look at the object". Consider this example:

#include <iostream>
#include <typeinfo>

struct Base {
  Base() {
    print_type(this);
  }

  virtual ~Base() = default;

  static void print_type(Base* obj) {
      std::cout << "obj has type: " << typeid(*obj).name() << std::endl;
  }
};

struct Derived : public Base {
  Derived() {
    print_type(this);
  }
};

print_type always receives a pointer to Base, but when you create an instance of Derived you will see two lines: one naming Base and one naming Derived (the exact text returned by name() is implementation-defined and may be a mangled name). The dynamic type is set at the very beginning of the constructor, so you can call a virtual function as part of the member initialization.
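
To make the observation independent of how a particular compiler spells type names, here is a minimal sketch that compares type_info objects instead of printing name(). The global `seen`, the function `record` and the helper `construct_derived` are hypothetical names introduced only for this illustration; they are not part of the original example.

```cpp
#include <string>
#include <typeinfo>
#include <vector>

// Hypothetical log used only for this illustration.
std::vector<std::string> seen;

struct Base {
    Base() { record(this); }
    virtual ~Base() = default;

    // Same role as print_type above, but it compares against known types
    // instead of relying on the implementation-defined text of name().
    static void record(Base* obj);
};

struct Derived : Base {
    Derived() { record(this); }
};

void Base::record(Base* obj) {
    if (typeid(*obj) == typeid(Derived)) {
        seen.push_back("Derived");
    } else if (typeid(*obj) == typeid(Base)) {
        seen.push_back("Base");
    } else {
        seen.push_back("other");
    }
}

// Constructs one Derived object and returns what record() saw, in order.
std::vector<std::string> construct_derived() {
    seen.clear();
    Derived d;
    return seen;
}
```

Calling construct_derived() should yield the two entries "Base" and "Derived", in that order, since typeid applied to the object under construction reports the class of the constructor that is currently running.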

It is not specified how or where this information is stored, but it is associated with the object itself.

[..] the implementation generates information associated with each such object [..]

In order to change the dynamic type, this information has to be updated. This may be some data that is introduced by the compiler, but operations on that data are still covered by the memory model:

[intro.memory] 6.7.1.3

A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having nonzero width. [ Note: Various features of the language, such as references and virtual functions, might involve additional memory locations that are not accessible to programs but are managed by the implementation. — end note]

So the information associated with the object is stored and updated in some memory location. But that is where data races happen:

[intro.races]

[..]
Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.
[..]
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other [..]

The update of the dynamic type is not atomic, and since there is no other synchronization that would enforce a happens-before order, this is a data race and therefore UB.

Even if the update were to be atomic, you would still have no guarantee about the state of the object as long as the constructor has not finished, so there is no point in making it atomic.


Update

Conceptually it feels like the object takes on different types during construction and destruction. However, it has been pointed out to me by @LanguageLawyer that the dynamic type of an object (more precisely of a glvalue that refers to that object) corresponds to the most derived type, and this type is clearly defined and does not change. [class.cdtor] also includes a hint about this detail:

[..] the function called is the final overrider in the constructor's or destructor's class and not one overriding it in a more-derived class.

So even though the behavior of virtual function calls and the typeid operator is defined as if the object takes on different types, that is actually not the case.

That said, in order to achieve the specified behavior something in the state of the object (or at least some information associated with that object) has to be changed. And as pointed out in [intro.memory], these additional memory locations are indeed subject to the memory model. So I still stand by my initial assessment that this is a data race.
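
One way to sidestep the problem, along the lines of the factory and helper-function suggestions made in the comments, is to launch the asynchronous call only after the most derived constructor has returned. The following is a minimal sketch under that assumption; `Start`, `Result` and `MakeStarted` are hypothetical names, and Foo returns a string here (instead of printing) purely so the result can be inspected.

```cpp
#include <future>
#include <memory>
#include <string>
#include <utility>

struct Base {
    virtual ~Base() = default;
    virtual std::string Foo() const { return "Base"; }

    // Hypothetical helper: must only be called once the most derived
    // constructor has returned, so the virtual call sees the final overrider.
    void Start() {
        future_ = std::async(std::launch::async, [this] { return Foo(); });
    }

    // Waits for the task started by Start() and returns its result.
    std::string Result() { return future_.get(); }

private:
    std::future<std::string> future_;
};

struct Derived : Base {
    std::string Foo() const override { return "Derived"; }
};

// Hypothetical factory: construct completely first, then start the task.
template <class T, class... Args>
std::unique_ptr<T> MakeStarted(Args&&... args) {
    auto obj = std::make_unique<T>(std::forward<Args>(args)...);
    obj->Start();  // construction is sequenced before the thread is launched
    return obj;
}
```

With this arrangement MakeStarted<Derived>()->Result() is expected to give "Derived". Note that the mirror-image problem exists on destruction: the task should have finished (for instance by calling Result()) before the object starts being destroyed.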

mpoeter
  • Nice answer, but there is still something that doesn't fit well in my mind. If there is an UB, what exactly causes it in the client code? If there is no Derived class, everything is well defined (because there is no change in the dynamic type). From the Derived implementor, nothing wrong as well, he just derived from Base... so innocent... So if we want to make a simple rule for such an UB, will it go like that: "If a base class constructor calls a virtual function, deriving from such a class will cause an UB. Is that what you're suggesting? – Gils Jun 10 '20 at 16:25
  • 2
    @Gils I think the correct perspective is to assume that if your type is polymorphic, the constructor will implicitly modify the object when it finishes. So the asynchronous function is always wrong to write, even if it happens to work if the dynamic type is the `Base` type. Because if you made your class polymorphic it implies that there is the intention to derive from it which will cause the asynchronous function to break. I guess the exception might be if you made the type `final`. Then I suppose you could use your asynchronous function during that type's construction. – François Andrieux Jun 10 '20 at 17:32
  • @FrançoisAndrieux Well... `final` and "base class" are kind of contradictions. I can't see any reason for someone to do that anyhow. But let me refine the suggested statement: "A polymorphic class should not call virtual member functions asynchronously in its constructor/destructor, doing so might result in an UB" – Gils Jun 10 '20 at 17:42
  • @Gils I believe that statement to be good advice and accurate based on what was shown so far. But in practice, in my opinion, it is probably not great design to start asynchronous functions that refer to `this` from the constructor, much less from the member initializer list. Even when it works and is legal, it is brittle and has a higher-than-normal chance to cause hard to find problems in the future. You need to be vigilant if you do use that design. – François Andrieux Jun 10 '20 at 17:50
  • @Gils Right, you won't add `final` to a base type, but any other polymorphic type can still experience this problem unless they are a leaf type (a polymorphic class that is not inherited from). In those cases, you can use `final` to make sure the problem doesn't occur, to prevent anyone from inheriting from it, triggering the problem. This is what I meant by the last part of my comment. – François Andrieux Jun 10 '20 at 18:00
  • @FrançoisAndrieux Thanks for the input. I would also love to hear more from you about why you think that making asynchronous calls from constructors are indications of bad design. You are already the second person who tells me that. I don't want to hijack the topic of this thread though and I also could not find private messages here :( – Gils Jun 10 '20 at 18:08
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/215693/discussion-between-francois-andrieux-and-gils). – François Andrieux Jun 10 '20 at 18:13
  • _dynamic type of the object_ Dynamic type is a property of an expression, not object. It is clearly written in the definition you quoted. _dynamic type changes in the course of the execution_ No, it doesn't. – Language Lawyer Jun 11 '20 at 00:31
  • Read the definition of [most derived object](https://timsong-cpp.github.io/cppwp/n4861/intro.object#6). – Language Lawyer Jun 11 '20 at 01:12
  • @LanguageLawyer of course a polymorphic object has a (dynamic) type: "An object has a type. Some objects are polymorphic; the implementation generates information associated with each such object that makes it possible to determine that object's type during program execution". I could rephrase to "the information associated with an object, which is used to determine that object's type, changes in the course of the execution". The term "the dynamic type of the object" is used multiple times in the standard. Can you elaborate how the definition of "most derived object" should help your point? – mpoeter Jun 11 '20 at 15:19
  • _The term "the dynamic type of the object" is used multiple times in the standard_ So? [It is known editorial issue](https://github.com/cplusplus/draft/issues/2190). _Can you elaborate how the definition of "most derived object" should help your point?_ Dynamic type is a property of an expression. But even if it were a property of an object, it couldn't change over time. most derived object is an object of most derived class type, and most derived class type doesn't change during construction. – Language Lawyer Jun 11 '20 at 15:58
  • _the implementation generates information associated with each such object that makes it possible to determine that object's type during program execution_ This part of [intro.object]/1 (and some other parts of it) is only formally normative, but really should be a [_Note:_ ... — _end note_]. So, your answer consists of: 1. "dynamic type" misuse/misunderstanding. 2. References to non-normative parts of the Standard such as Notes or things that should be Notes. There can't be UB because you "violate" some "rule" from a Note. – Language Lawyer Jun 11 '20 at 16:03
  • @LanguageLawyer thanks for the link, I wasn't aware of that issue. I think we can agree that _an object has a type_. Consider my example in the answer: the expression (`typeid(*obj)`) and the value of `obj` are the same in both cases, yet the dynamic type _of the expression_ is different. So _something_ must have changed. _There can't be UB because you "violate" some "rule" from a Note_ - I do not fully agree. – mpoeter Jun 11 '20 at 16:46
  • [ISO/IEC Directives, Part 2](https://www.iso.org/sites/directives/current/part2/index.xhtml#_idTextAnchor321) states: _Notes are used for giving additional information intended to assist the understanding or use of the text of the document._ A good example is the note that _various features of the language, such as references and virtual functions, might involve additional memory locations_, which clearly indicates the intention that structures introduced by the compiler (like vtables) are indeed covered by the memory model, even though this is not explicitly stated in the _normative_ text. – mpoeter Jun 11 '20 at 16:46
  • _the dynamic type of the expression is different_ You're demonstrating that you still haven't understood the definition of dynamic type. For any expression `e` denoting some object, its dynamic type is always the same. Always! It doesn't depend on whether the object is polymorphic, has base classes, have they already been constructed etc. – Language Lawyer Jun 11 '20 at 16:51
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/215760/discussion-between-mpoeter-and-language-lawyer). – mpoeter Jun 11 '20 at 17:00
1

From: https://isocpp.org/wiki/faq/strange-inheritance#calling-virtuals-from-ctors

You can call a virtual function in a constructor, but be careful. It may not do what you expect. In a constructor, the virtual call mechanism is disabled because overriding from derived classes hasn’t yet happened. Objects are constructed from the base up, “base before derived”.

If the "construction phase" has not finished by the time your async function gets called, it will call the function of the class that is currently being constructed.

Is setting the vtable by the compiler, thread safe?

To my understanding, it is not thread-safe, but no one should be modifying that memory location except the allocator and initializer.

Carlos Ch
  • 1
    By the time the async is called, Base is already well defined. From the Base perspective, it is legit to call its member functions. Base does NOT modify the vtable. It is actually Derived constructor which modifies it (implicitly by the compiler) – Gils Jun 09 '20 at 17:49
  • @interjay The current type that is being initialized, if we have animal->mammal->cat, and we call a virtual function in the constructor of mammal then we will call mammal's function and not the cat's. – Carlos Ch Jun 09 '20 at 17:59
  • @Gils for some reason it seems that 'future' is blocking until the thread exits, I just tested the code in Visual Studio, by adding a sleep_for(1s). Might be something related to this: https://stackoverflow.com/questions/40030421/using-stdasync-in-constructor I need to read more on std::async and std::future – Carlos Ch Jun 09 '20 at 18:03
  • @CarlosCh In the linked question, `std::async` looks like it blocks because the `std::future` it returns is destroyed right away, and `std::future` will block on destruction until the associated function called with `std::async` returns or throws. It doesn't happen here because it is stored in a member and will only be destroyed in the class' destructor. – François Andrieux Jun 09 '20 at 18:05
  • @CarlosCh That's the expected behavior from a future returned from an async. – Gils Jun 09 '20 at 18:08
  • @interjay I believe that in order to create an undefined behavior, you need to break a language rule. Not the other way. So if the standard does not say anything about vtables, which rule did I break? In that case, I believe that the "compiler implementor" must make the vtable creation thread-safe. Otherwise, accessing the vtable from multiple threads (and at least one which writes) results in undefined behavior. – Gils Jun 09 '20 at 18:31
  • @Gils You are accessing an object in one thread while another thread might be modifying it. That's as undefined as `int i;` then `i++;` in one thread and `--i;` in another. This is a data race and is UB because it's not one of the data race cases with defined behavior. "The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior." – David Schwartz Jun 09 '20 at 18:46
  • 1
    @DavidSchwartz The vtable is not part of the standard and I'm not modifying it explicitly. It is not part of my code, it is part of the code the compiler generates in order to implement polymorphism. I did not break that rule, I did not create a race condition... the compiler did. (sorry for copy/pasting my comments from the other answers) – Gils Jun 09 '20 at 18:55
  • @Gils Forget about the vtable. The constructor is modifying the object. The other thread is accessing the object. So you have one thread accessing an object while another thread may be modifying it. That's a data race and so unless one of the exceptions apply, it's UB. – David Schwartz Jun 09 '20 at 19:48
  • 1
    @DavidSchwartz That is just not true. Foo does not access anything which had not been already initialized (Except for the vtable of course - which I have no control over it). AFAIK, it is fully legal to access member variables that were already constructed from the constructor (even from a different thread). – Gils Jun 09 '20 at 19:57
  • @Gils Are you saying a constructor doesn't modify the value of an object? A constructor, pretty much by definition, gives an object its initial value. That is a modification of the value. Foo accesses the very same object whose value is still being changed by the constructor. Yes, accessing member variables is fine because member variables are distinct objects from the object they are members of. Here, one thread is modifying an object while another thread is accessing that very same object. (You must argue that either the constructor doesn't modify the object or the thread doesn't access it.) – David Schwartz Jun 09 '20 at 20:03
  • 2
    @DavidSchwartz I think there is a case to be made that it's not clear that this constructor changes this object here in a way that can race with `Foo`. The only thing the constructor visibly touches after `std::async` is called is the `future_` member which is not read by the `Foo` member function. I understand the impression that a constructor necessarily modifies an object, but it isn't always true for a trivial constructor and that isn't the hurdle to pass in this case. For there to be a race, the change needs to happen where it could conflict with the spawned thread's accesses. – François Andrieux Jun 09 '20 at 20:08
  • @FrançoisAndrieux The constructor transitions the object from unconstructed to constructed. I don't see how you can argue that's not a change in the object's value. I don't agree that the change needs to happen where it could conflict with the spawned thread's accesses -- that's not what the standard says. The standard says it's a data race if an object is accessed while its value may be changing. – David Schwartz Jun 09 '20 at 21:06
  • @DavidSchwartz A data race is all about an object's value. So whether or not there is a data race depends on if "constructedness" is part of an object's value. I don't know if there is anything to support that beyond intuition. – François Andrieux Jun 09 '20 at 21:31
  • @DavidSchwartz I think "its value may be changing" means something actually changes the value. Foo doesn't change anything. It is totally fine to call member functions in the constructor and let these functions access members which were already initialized. Actually, in this case, Foo doesn't even access anything. – Gils Jun 09 '20 at 21:31
  • @Gils Invoking a member function on an object accesses that object. That's why it's not legal if the object is not legal to access (say, after it's destroyed). The rule is that one thread may not access an object while another thread might be modifying it. I don't see how you can argue constructing an object doesn't modify it. – David Schwartz Jun 09 '20 at 22:00
  • @DavidSchwartz Please let me understand what you're saying. Forget about the Derived class and the virtuality of the function for a moment. Are you saying that it is UB to create a thread in a class constructor and make this thread access already initialized members in the class before the constructor ends? – Gils Jun 09 '20 at 22:07
  • @Gils Not in absolutely all cases. But if they race with the completion of the constructor, yes. The completion of the constructor changes the value of the object and it is not permissible to access an object while another thread might change the value of that same object -- calling a member function is an access to an object. This is why it is recommended to use a helper function to construct the object that launches threads *after* the constructor returns. – David Schwartz Jun 09 '20 at 22:10
  • @DavidSchwartz I'm not familiar with that recommendation. I'll be happy to have a reference for your claim if you have one. – Gils Jun 09 '20 at 22:20
  • [This article](https://rafalcieslak.wordpress.com/2014/05/16/c11-stdthreads-managed-by-a-designated-class/) offers the list of workarounds. – David Schwartz Jun 09 '20 at 22:29
  • @DavidSchwartz Thanks, I read it. Not really convinced though... I mean, using threads as members obviously might have drawbacks. But I don't think they result in UB if you make sure the thread does not access uninitialized members and make sure there is no concurrent access to members. I don't think that "this" pointer counts. After all, the "this" pointer is already initialized when you enter the constructor. Its value won't be changed until the destructor is called. – Gils Jun 09 '20 at 22:36
1

I believe [class.base.init]/16:

Member functions (including virtual member functions) can be called for an object under construction. Similarly, an object under construction can be the operand of the typeid operator or of a dynamic_­cast. However, if these operations are performed in a ctor-initializer (or in a function called directly or indirectly from a ctor-initializer) before all the mem-initializers for base classes have completed, the program has undefined behavior.

should answer the question. However, it is defective. The fix would be to replace "before" with "not after":

However, if these operations are performed in a ctor-initializer (or in a function called directly or indirectly from a ctor-initializer) not after all the mem-initializers for base classes have completed, the program has undefined behavior.

Currently, the paragraph says that the behavior is undefined only if the invocation of a member function happens before mem-initializers for base classes have completed, but doesn't cover your case: when the invocation neither happens before base classes initialization completion nor base classes initialization completion happens before the invocation.
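
For reference, the single-threaded situation that the paragraph does cover can be sketched as follows. This is only an illustration modeled on the kind of example the paragraph is about; the names `A`, `B`, `f`, `j` and `well_defined_value` are made up here, and the undefined variant is deliberately left commented out.

```cpp
struct A {
    explicit A(int v) : a(v) {}
    int a;
};

struct B : A {
    // Well-defined: when j is initialized, the mem-initializer for the
    // base class A has already completed, so calling f() is allowed.
    B() : A(1), j(f()) {}

    // Undefined per [class.base.init]/16 (not compiled on purpose):
    // f() would be called before the mem-initializer for A has completed.
    // B() : A(f()), j(0) {}

    int f() const { return a + 41; }
    int j;
};

// Constructs a B through the well-defined path and returns its j member.
int well_defined_value() {
    B b;
    return b.j;
}
```

Here well_defined_value() returns 42. The asynchronous case from the question fits neither variant cleanly, because the call is not ordered relative to the completion of the base class initialization at all.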

Language Lawyer
  • You are a real Lawyer. I'm still trying to figure out what you're trying to say :) – Gils Jun 10 '20 at 02:06
  • I'm trying really hard to understand this paragraph and I'm still not sure it is an exact match. If I understand correctly, [class.base.init]/16 says that it is legal to call a member function for an object under construction as long as its base classes have completed. From Base perspective, there is nothing wrong. It has no base classes, so there is no prevention from async to call Foo. But then comes Derived which derives from Base. Derived doesn't know anything about Base's implementation. How can it create UB just by deriving from Base? – Gils Jun 10 '20 at 02:17
  • @Gils _How can it create UB just by deriving from Base?_ This is just a literal read of this paragraph :-). Even `struct B { B() { f(); } void f() {} }; struct D : B { D() : B() {} } d;` is UB from this paragraph's POV. `B::B()` function is called from `D::D()`'s _ctor-initializer_, where `B::f()` is called before _mem-initializer_ for `B` has completed. – Language Lawyer Jun 10 '20 at 02:21
  • Thanks for your help here... Honestly, I'm trying really hard. so to put it in other words, this paragraph says that you are allowed to call a member function from the constructor (object under construction) but you are not allowed to inherit from such an object. Did I get it right? – Gils Jun 10 '20 at 02:30
  • @Gils this looks like a defect. – Language Lawyer Jun 10 '20 at 02:30
  • @Gils _I'm still trying to figure out what you're trying to say_ Do you know https://timsong-cpp.github.io/cppwp/n4659/intro.races#def:happens_before, https://timsong-cpp.github.io/cppwp/n4659/intro.races#def:happens_after? – Language Lawyer Jun 10 '20 at 02:43
  • Is the lambda body really a function "called directly or indirectly from a _ctor-initializer_"? I'm not certain the paragraph even applies. "Under construction or destruction" is maybe even worse: these presumably have some time-like relationship to "initialization/destructor begins" and "initialization/destructor completed", but this is never precisely said. ([basic.life]/11 seems like it ought to help, except the phrases aren't even used together with the words "before" and "after".) – aschepler Jun 10 '20 at 03:07
  • @aschepler It is the best what I've found. _Is the lambda body really a function "called directly or indirectly from a ctor-initializer"?_ Why not? – Language Lawyer Jun 10 '20 at 03:12
  • 1
    The paragraph does not make the example in an earlier comment including `B() { f(); }` UB. The member function `f` is "called for an object" which has type `B` and is a subobject of another object of type `D`. It is not "called for" the object of type `D`. – aschepler Jun 10 '20 at 03:12
  • @aschepler _The member function f is "called for an object" which has type B and is a subobject of another object of type D. It is not "called for" the object of type D._ Hm. Makes sense. – Language Lawyer Jun 10 '20 at 03:14
  • My instinctive read is for "called from" to be the "ordinary" case where execution of the entire function body is tied in sequence to a full-expression in some other function body, as when [basic.exec]/11 talks about an evaluation "within" a function invocation. And "directly or indirectly" as meant just to handle cases like `f` calls `g` calls `h` chains. A signal handler, or the initial function of a thread, would not be "called from" any other function. I can see other meanings, but the complications of multithreading make "from" feel too simple. – aschepler Jun 10 '20 at 03:30
  • @aschepler Ok. But do you have an idea why the code in OP has UB? (Not to be read as "if you don't, then my answer is correct"). – Language Lawyer Jun 10 '20 at 03:33
  • I was thinking along the lines of: To paraphrase [basic.life]/4, essentially any use of any object is UB unless it's during the object's lifetime, or specifically covered by some other paragraph about properties of objects when not during their lifetimes. And [basic.life]/11 makes it clear "during its lifetime" involves "happens before/after", so that doesn't apply. The other cases in [basic.life] and [class.cdtor] at least imply some sort of time-like ordering as a precondition. ... – aschepler Jun 10 '20 at 03:54
  • ... If multithreading could mean that an evaluation doesn't qualify as **any** of the categories "before initialization starts", "under construction", "during lifetime", "under destruction", or "after destruction completes", then none of the specific allowances apply, and we're left with [basic.life]/4 - it's not during the object's lifetime, so it's UB. But this is hard to make a strong claim out of, since the exact meaning of some of the phrases is unclear in the presence of multiple threads. – aschepler Jun 10 '20 at 03:58
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/215629/discussion-between-language-lawyer-and-aschepler). – Language Lawyer Jun 10 '20 at 04:36
  • @aschepler Are you going to write an answer? – Language Lawyer Jun 15 '20 at 17:58
-1

Yes.

Strictly speaking, no. You probably can, with some effort, construct a malicious example which is, at least formally, not thread-safe. It will still be thread-safe in practice, though.

Unless you are deliberately trying to be obnoxious, however, and in particular in the context of your question, it's definitely yes.

An object is either not present at all, being constructed, or fully constructed. The only state that is somewhat awkward is the being-constructed one.
Calling virtual functions from constructors on partially-constructed objects is sometimes discouraged, but it is entirely legitimate as long as one is aware of the implications. That is, the same object is different things at different times.

As for your concrete example: you call a function for which you capture this, which at that time refers to an object of type Base. There is not much doubt about what type of object the thread will use when it eventually gets to run, because it must use that one copy of this, which is just that and nothing else.

The standard does not define exactly how things work with inheritance and vtables and all that (but that's merely unspecified, not Undefined Behavior). In practice, it's usually a pointer to a static structure that is being updated, and on most architectures (all reasonable architectures, for that matter) that's an atomic operation anyway.
So even if there needed to be some sort of atomicity, it would in practice be there anyway.

It isn't even needed, though. The capture happens from exactly one thread, which is the thread that is currently constructing the object, and there is only a single state the object could possibly be in, and there is no doubt about what it is.

Damon
  • Thank you for your answer. But I'm not sure I fully understand what you're suggesting. Are you suggesting that Base::Foo will be called from the other thread? And there will be no polymorphism? just because the "this" pointer was captured while "this" pointed to Base? – Gils Jun 09 '20 at 20:09
  • 1
    There is also the state of "being destructed" during which the object's lifetime has already ended but which can still be used by the destructor in some ways. – François Andrieux Jun 09 '20 at 20:11
  • 2
    *"...and there is no doubt about what it is"* There is no doubt *at the time the capture happens* but that is not the part that is concerning about this question. There very much is a doubt when the spawned thread dereferences the captured pointer. It is unknown what dynamic type it might have at that time. – François Andrieux Jun 09 '20 at 20:14
  • @FrançoisAndrieux: You do realize that you're copying a `this` pointer, don't you? And you realize it's not like a `void*` of sorts. The dynamic type that the object may have after finishing construction has a _different_ `this` pointer (and thus a different type). The spawned thread will still have the "old" pointer, and thus only access to the valid base sub-object, and the object type will be (via that pointer) the one it was at the time of capture. There is absolutely no question as to what's what to whom. – Damon Jun 10 '20 at 10:31
  • Gils: There **is** polymorphism, but in the spawned thread it's always the same one choice (not "poly" as such) since you only have a single, non-changing `this` ptr. Non-static data or functions can only be accessed with a `this` pointer. This may be one out of many, depending on what state the object is in, and what the pointer was cast to. In this case, it's a pointer copied at the time when the object was a `Base`, so it is a `Base` object, not anything different (unless you down-cast it, but that would obviously mean very bad karma; it's technically possible, but then you do have UB). – Damon Jun 10 '20 at 10:42
  • Note that it is very possible (and even likely) that the `Base` is meanwhile a `Derived`, and it may even have valid data fields. But being viewed through a `Base*` that data doesn't exist at all (it does, but you pretend it doesn't). In the same way, the virtual functions _aren't_ the implementations of `Derived` because you're looking through a `Base*`, which is what you copied in the capture. – Damon Jun 10 '20 at 10:48
  • 1
    @Damon you ignore the fact that the captured pointer points to an object that is changed _after_ the pointer has been captured. And since you captured a pointer, you are about to access the _updated_ object - and it is absolutely _not_ clear in which state that object is at that point. Even if you look at it as `Base`, the fact that in the meantime the underlying object has become `Derived` _does_ matter. Suppose you have a non-virtual function in `Base` that returns `typeid(*this)`. The result would depend on the state of the object, regardless of whether you look at it as `Base*` or `Derived*`. – mpoeter Jun 10 '20 at 13:04
  • @Damon The behavior of that pointer in regard to `virtual` member functions will be different depending on whether the constructor has finished or not, and the address of the `Base` will not have changed. You can't capture, in a simple pointer, which of the two behaviors you are supposed to get. For most purposes, `this` behaves as-if it had the dynamic type `Base` during construction and "becomes" more derived types as derived type constructors complete. – François Andrieux Jun 10 '20 at 13:18
  • @Damon GCC and Clang call Base::Foo while VS calls Derived::Foo. So either there is a bug in one of them, or it is indeed UB. – Gils Jun 10 '20 at 14:45