In 2010, Bjarne Stroustrup, the creator of C++, wrote the paper “New” Value Terminology in which he explains the value categories of expressions introduced in the C++11 standard* (lvalue, xvalue, and prvalue, and their generalizations glvalue and rvalue):

There were only two independent properties:

  • “has identity” – i.e. an address, a pointer, the user can determine whether two copies are identical, etc.
  • “can be moved from” – i.e. we are allowed to leave the source of a “copy” in some indeterminate, but valid state

This led me to the conclusion that there are exactly three kinds of values (using the regex notational trick of using a capital letter to indicate a negative – I was in a hurry):

  • iM: has identity and cannot be moved from
  • im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)
  • Im: does not have identity and can be moved from

The fourth possibility (“IM”: doesn’t have identity and cannot be moved) is not useful in C++ (or, I think) in any other language. In addition to these three fundamental classifications of values, we have two obvious generalizations that correspond to the two independent properties:

  • i: has identity
  • m: can be moved from
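
As a rough illustration of the two properties (my own sketch, not taken from Stroustrup's paper), overload resolution can stand in for “can be moved from” and binding to a `const X&` can stand in for “has identity”. The helper names `can_be_moved_from` and `identity_of` are hypothetical, chosen only for this example:

```cpp
#include <utility>

struct X { int n; };

// "m": the X&& overload is selected only for expressions that can be moved from.
constexpr bool can_be_moved_from(X&)  { return false; }
constexpr bool can_be_moved_from(X&&) { return true; }

// "i": binding to const X& lets us observe which object an expression denotes.
constexpr const X* identity_of(const X& r) { return &r; }

X x{1};  // a named object with static storage duration

// iM (lvalue): has identity, cannot be moved from.
static_assert(!can_be_moved_from(x), "lvalue selects the X& overload");

// im (xvalue): has identity (it is still the object x) and can be moved from.
static_assert(can_be_moved_from(std::move(x)), "xvalue selects the X&& overload");
static_assert(identity_of(std::move(x)) == &x, "std::move(x) denotes x itself");

// Im (prvalue): no pre-existing object to identify, can be moved from.
static_assert(can_be_moved_from(X{4}), "prvalue selects the X&& overload");
```

The checks are all `static_assert`s, so compiling the snippet is the whole test; nothing is actually moved.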

In 2015, Richard Smith, then the C++ standard editor, wrote the paper Guaranteed copy elision through simplified value categories in which he explains the rewording of the value categories of expressions introduced in the C++17 standard**:

However, these rules are hard to internalize and confusing -- for instance, an expression that creates a temporary object designates an object, so why is it not an lvalue? Why is NonMoveable().arr an xvalue rather than a prvalue? This paper suggests a rewording of these rules to clarify their intent. In particular, we suggest the following definitions for glvalue and prvalue:

  • A glvalue is an expression whose evaluation computes the location of an object, bit-field, or function.
  • A prvalue is an expression whose evaluation initializes an object, bit-field, or operand of an operator, as specified by the context in which it appears.

That is: prvalues perform initialization, glvalues produce locations.

Denotationally, we have:

  • glvalue :: Environment -> (Environment, Location)
  • prvalue :: (Environment, Location) -> Environment

So far, this is not a functional change to C++; it does not change the classification of any existing expression. However, it makes it simpler to reason about why expressions are classified as they are:

struct X { int n; };
extern X x;
X{4};   // prvalue: represents initialization of an X object
x.n;    // glvalue: represents the location of x's member n
X{4}.n; // glvalue: represents the location of X{4}'s member n;
        //          in particular, xvalue, as member is expiring

Basically, Smith only reworded Stroustrup’s definition of a prvalue from ‘does not have identity’ to ‘performs initialization’.
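
The classification of Smith's three expressions can be checked with `decltype`: for a parenthesized expression `e` of type `T`, `decltype((e))` is `T` for a prvalue, `T&` for an lvalue, and `T&&` for an xvalue. This is a minimal sketch that assumes a compiler implementing the post-DR616 rule for member access:

```cpp
#include <type_traits>

struct X { int n; };
extern X x;  // declaration only; decltype operands are unevaluated, so no definition is needed

// X{4} is a prvalue: decltype yields the plain type.
static_assert(std::is_same<decltype((X{4})), X>::value, "X{4} is a prvalue");

// x.n is an lvalue: decltype yields an lvalue reference.
static_assert(std::is_same<decltype((x.n)), int&>::value, "x.n is an lvalue");

// X{4}.n is an xvalue: decltype yields an rvalue reference.
static_assert(std::is_same<decltype((X{4}.n)), int&&>::value, "X{4}.n is an xvalue");
```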

I am still unclear about the following things (so these are my questions):

  1. The meaning of Smith’s notations ‘glvalue :: Environment -> (Environment, Location)’ and ‘prvalue :: (Environment, Location) -> Environment’.
  2. The reason why Smith’s expression X{4}.n is not a prvalue under the C++17 standard**, given that it performs initialization of the complete object X{4} (called ‘temporary object materialization’) and in particular of its subobject n.
  3. The reason why Smith’s expression X{4}.n is not a prvalue under the C++11 standard*, given that it represents a subobject of a temporary object.

Notes

* The value categories of expressions in the C++11 standard, [basic.lval/1] (bold emphasis mine):

  • An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [ Example: If E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function whose return type is an lvalue reference is an lvalue. — end example ]
  • An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references ([dcl.ref]). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. — end example ]
  • A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
  • An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a temporary object ([class.temporary]) or subobject thereof, or a value that is not associated with an object.
  • A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a reference is a prvalue. The value of a literal such as 12, 7.3e5, or true is also a prvalue. — end example ]

** The value categories of expressions in the C++17 standard, [basic.lval/1] (bold emphasis mine):

  • A glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function.
  • A prvalue is an expression whose evaluation initializes an object or a bit-field, or computes the value of the operand of an operator, as specified by the context in which it appears.
  • An xvalue is a glvalue that denotes an object or bit-field whose resources can be reused (usually because it is near the end of its lifetime). [ Example: Certain kinds of expressions involving rvalue references yield xvalues, such as a call to a function whose return type is an rvalue reference or a cast to an rvalue reference type.  — end example ]
  • An lvalue is a glvalue that is not an xvalue.
  • An rvalue is a prvalue or an xvalue.
Géry Ogam
  • [basic.lval]/1, while normative in form, is not really normative in its content and can be replaced with just an enumeration of the "primary" value categories and how they combine into "secondary" ones, with zero impact on the rest of the standard. «denotes an object or bit-field whose resources can be reused» etc. is just some fluff giving you an approximate idea of how such expressions are informally treated, but which can't really be used to draw any normative conclusions. – Language Lawyer Sep 09 '21 at 22:37
  • Re. 1, that looks like Haskell syntax. My wishy-washy interpretation: glvalue is something that takes the Environment, adds a new object to it and returns this new Environment and the Location of that object; prvalue is something that takes the Environment and the Location, initializes the Location, which modifies the Environment, and returns this new Environment. – danadam Sep 09 '21 at 22:58
  • @danadam glvalue doesn't add a new object, it extracts an existing object's location from the environment – Language Lawyer Sep 09 '21 at 23:00
  • To focus on your example: `X{4}` is a prvalue by definition from [expr.type.conv/2](https://eel.is/c++draft/expr.type.conv#2.sentence-3), `x.n` is an lvalue by definition from [expr.ref/6.2](https://eel.is/c++draft/expr.ref#6.2.sentence-2), `X{4}.n` is an xvalue by the same expr.ref/6.2 sentence. basic.lval plays no role. – Cubbi Sep 10 '21 at 02:51
  • @Cubbi Thanks a lot for the precise references. I have just added a little more detail in the post (although the questions remain the same). – Géry Ogam Sep 10 '21 at 08:02
  • @LanguageLawyer I see, but the concepts guide the technical details which merely enumerate the value categories in all possible situations of the language. So the value category of a simple expression like `X{4}.n` should be easily derived from the concepts without need to look up the precise item in the standard. This is at least my understanding of the philosophy of Smith’s rewording (‘This paper suggests a rewording of these rules to clarify their intent.’). The proof is that he gave the value category of `X{4}.n` only from his concepts. I have just added a little more detail in the post. – Géry Ogam Sep 10 '21 at 08:28
  • @Cubbi Is a prvalue with a result object always [converted to an xvalue](https://timsong-cpp.github.io/cppwp/conv.rval) (e.g. `X{4};`)? And is the result object of a prvalue always a [temporary object](https://timsong-cpp.github.io/cppwp/class.temporary) (e.g. `X x = X{4};`)? – Géry Ogam Sep 12 '21 at 11:54
  • @Maggyero For non-class types, no: `2+2` does not involve temporaries. For class types there's the note in basic.lval https://eel.is/c++draft/basic.lval#5.sentence-5 excluding decltype. As for whether all conceivable ways of initializing class objects involve binding a reference to (and therefore materializing) a prvalue: maybe. – Cubbi Sep 13 '21 at 18:05

1 Answer

  1. This has been largely answered in the comments, but to elaborate: the semantics of any imperative system can be expressed without side effects by considering the state of “the world” (starting with all of RAM) as an argument to a function and as (part of) its return value. This notation indicates that evaluating a glvalue selects an address (the identity of an object) from that environment (and possibly alters it) whereas evaluating a prvalue requires such a location and alters the environment to contain an initialized object there (possibly with other side effects).
  2. X{4}.n doesn’t initialize n (with what, itself?); it allows access to (i.e., identifies) the value established by just X{4} (which is materialized so as to have a particular n to identify).
  3. You’re right about its temporary status, but that just makes it an rvalue; a prvalue is an rvalue that is not also an xvalue.
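
To make points 1 and 2 more concrete, here is a toy model of the two signatures in C++. It is only a sketch under loud assumptions: `Environment` is a hypothetical four-cell “RAM”, `Location` is an index into it, and the function names are invented for this example; none of it comes from Smith's paper.

```cpp
#include <cstddef>
#include <utility>

using Location = std::size_t;          // hypothetical: an index into the toy memory
struct Environment { int cells[4]; };  // hypothetical: "the world" as four int cells

// prvalue :: (Environment, Location) -> Environment
// Models the prvalue `4`: the context supplies a location, the prvalue initializes it.
constexpr Environment prvalue_4(Environment env, Location where) {
    env.cells[where] = 4;
    return env;
}

// glvalue :: Environment -> (Environment, Location)
// Models the glvalue `x`, assuming x lives in cell 0: it only selects a location.
constexpr std::pair<Environment, Location> glvalue_x(Environment env) {
    return std::pair<Environment, Location>(env, Location{0});
}

// Temporary materialization gives a prvalue the shape of a glvalue: the context
// picks a scratch location (cell 3 here), lets the prvalue initialize it, and
// then hands back that location.
constexpr std::pair<Environment, Location> materialized_4(Environment env) {
    return std::pair<Environment, Location>(prvalue_4(env, Location{3}), Location{3});
}

// `int x = 4;` followed by reading `x`.
constexpr int read_x_after_init() {
    Environment env0{};                               // all cells zero
    Environment env1 = prvalue_4(env0, Location{0});  // x's location comes from the declaration
    const std::pair<Environment, Location> r = glvalue_x(env1);
    return r.first.cells[r.second];
}

static_assert(read_x_after_init() == 4, "x holds 4 after initialization");
static_assert(materialized_4(Environment{}).first.cells[3] == 4, "temporary holds 4");
```

In this model `X{4}.n` corresponds to `materialized_4` followed by an offset into the returned location: the member access identifies something, while the initialization is done by the `X{4}` part alone.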
Davis Herring
  • 3. Note that `X{4}.n` was a prvalue in C++11, till DR616 turned it into an xvalue. – Language Lawyer Sep 10 '21 at 08:33
  • Thanks. About 3, you said ‘You’re right about its temporary status, but that just makes it an rvalue;’ I disagree, because the C++11 standard specifies: ‘An *rvalue* (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is **an xvalue, a temporary object ([class.temporary]) or subobject thereof,** or a value that is not associated with an object.’ In this sentence, ‘xvalue’ and ‘temporary object or subobject thereof’ are exclusive. So `X{4}.n` is a prvalue under the C++11 standard. – Géry Ogam Sep 10 '21 at 13:55
  • @LanguageLawyer Thanks, indeed this is confirmed by [\[expr.ref-4.2\]](https://timsong-cpp.github.io/cppwp/n3337/expr.post#expr.ref-4.2) in C++11: ‘If `E1` is an lvalue, then `E1.E2` is an lvalue; if `E1` is an xvalue, then `E1.E2` is an xvalue; **otherwise, it is a prvalue.**’ So my question 3 is solved. – Géry Ogam Sep 10 '21 at 14:05
  • About 2, you said: ‘`X{4}.n` doesn’t initialize `n` (with what, itself?);’ No, with its complete object `X{4}`. If the subobject `n` was not initialized by the expression, how would you be able to access it in the first place? You also said: ‘it allows access to (*i.e.*, **identifies**) the value established by just `X{4}`’ The discarded prvalue in the statement `X(4);` also allows access to its value. So following your logic it should be an xvalue as well. Yet it is a prvalue. – Géry Ogam Sep 10 '21 at 14:24
  • @Maggyero: `n` is initialized as one of the effects of evaluating `X{4}.n`, but it’s not initialized by the expression as a whole; it would be silly to say that `std::cout << 0;` is a prvalue because it initializes a sentry object. That `X{4}` can be seen as describing an identity is one of the main points of confusion Richard resolved: it doesn’t anymore because you can write `X x=X(X{4});` and still get only one object that is conceived by the definition, not any part of the initializer. – Davis Herring Sep 10 '21 at 15:15
  • I think I get it. The key is to only look at the *full* expression, not a subexpression of it. So a *prvalue* is an expression that *as a whole* creates a new object (or equivalently does not denote an existing object), and a *glvalue* is an expression that *as a whole* denotes an existing object (or equivalently does not create a new object). With this interpretation the classification of Smith’s expressions is straightforward: `X{4}` is a prvalue, `x.n` is a glvalue, and `X{4}.n` is a glvalue. – Géry Ogam Sep 10 '21 at 23:56
  • Now what about `X(X(X{4}))`? Doesn’t the full expression denote the existing object created by the subexpression `X{4}`, making the full expression a glvalue? – Géry Ogam Sep 10 '21 at 23:56
  • @Maggyero: In C++11, it denotes the last of several temporaries created, one from another. In C++17, it’s all one prvalue that has yet to materialize anything. – Davis Herring Sep 11 '21 at 03:15
  • I see. Is a prvalue with a result object always [converted to an xvalue](https://timsong-cpp.github.io/cppwp/conv.rval) (e.g. `X{4};`)? Is the result object of a prvalue always a [temporary object](https://timsong-cpp.github.io/cppwp/class.temporary) (e.g. `X x = X{4};`)? – Géry Ogam Sep 11 '21 at 20:54
  • @Maggyero: Discarded values are generally materialized in C++17 where that matters, but not always (*e.g.*, in `decltype`); the non-temporary `x` is the result object there. – Davis Herring Sep 11 '21 at 20:58
  • Okay so result object = xvalue (`decltype` prvalue operands do not have result objects so they do not matter here)? If `X x = X{4};` creates a *non*-temporary `x`, then why does `const X& x = X{4};`, which extends the lifetime of the result object, still create a [temporary object](https://timsong-cpp.github.io/cppwp/class.temporary#6) `x`? – Géry Ogam Sep 11 '21 at 21:10
  • @Maggyero: The result object is not an *expression*, so it isn’t an xvalue or any other kind. A lifetime-extended temporary is still a temporary because it’s still materialized *from* a prvalue (to provide a glvalue to which the reference can be bound) rather than being a variable’s object initialized *with* that prvalue. – Davis Herring Sep 11 '21 at 22:00
  • Sorry, I meant: is a prvalue *that has a result object* always converted to an xvalue? So ‘temporary object’ means ‘auxiliary object for another entity’ rather than ‘transient object whose lifetime is limited by the full expression’? – Géry Ogam Sep 11 '21 at 23:52
  • @Maggyero: No, in C++17 prvalues are not converted to xvalues when initializing an object of the same type. – Davis Herring Sep 12 '21 at 01:08
  • Yet according to [\[conv.rval\]](https://timsong-cpp.github.io/cppwp/conv.rval): *‘A prvalue of type T can be converted to an xvalue of type T. This conversion initializes a temporary object ([class.temporary]) of type T from the prvalue by evaluating the prvalue with the temporary object as its result object, and produces an xvalue denoting the temporary object.’* So my understanding is that a discarded prvalue initializes a temporary object of the same type and produces an xvalue denoting that object (e.g. `X{4};` where `X{4}` is both a prvalue and an xvalue produced by the prvalue). – Géry Ogam Sep 12 '21 at 11:48
  • @Maggyero: True, but that doesn’t make the (unconverted) expression an xvalue any more than `'a'+0` makes `'a'` itself an `int`, and (in case it’s not clear) the converted xvalue isn’t used to initialize anything. – Davis Herring Sep 12 '21 at 16:30
  • So could we say that there are two kinds of expressions: *compile-time* expressions and *run-time* expressions (for instance converted expressions are of the latter kind)? E.g. in the statement `X{4}.n;`, the subexpression `X{4}` is both a compile-time prvalue and a run-time xvalue, and the full expression `X{4}.n` is both a compile-time and run-time xvalue. – Géry Ogam Sep 13 '21 at 15:09
  • @Maggyero: I don’t see what that terminology adds beyond “converted”/“unconverted”, nor why one would be called “run-time”. – Davis Herring Sep 13 '21 at 16:37
  • I used ‘run-time value category’ as ‘run-time type’. But my terminology was probably unnecessary. So far we said: some prvalues *create* an object (the *result object*), which can either be a temporary object (e.g. `X{4}` in `X&& x = X{4};`) or a non-temporary object (e.g. `X{4}` in `X x = X{4};`). Now some prvalues *create* a value (no result object), which is an operand of a built-in operator (e.g. `1` in `i + 1;` and `i = 1;`). In `i = 1;`, since `1` does not create an *object* but the *value* 1, the lifetime of the original object `i` does not end, its value is just updated, do you agree? – Géry Ogam Sep 14 '21 at 06:07
  • @Maggyero: [basic.lval]/1.2 already gives the two possibilities as “initializes an object or computes the value of an operand”, although it would be unobservable to instead say that `i=1;` materialized an xvalue and copied its value into `i`. But prvalues never **create** objects; the result object is supplied externally, albeit perhaps implicitly via materialization. – Davis Herring Sep 14 '21 at 06:54
  • Don’t you mean prvalues don’t always create *temporary* objects (they can also create non-temporary objects, i.e. not materialize a temporary object, and they can also create values)? Because what is the difference between object *creation/construction* and object *initialization*? – Géry Ogam Sep 14 '21 at 07:02
  • @Maggyero prvalues do not create objects at all. They initialize them. And can initialize both temporary and non-temporary objects. – Language Lawyer Sep 14 '21 at 12:44
  • @Maggyero: Creating an object involves selecting its storage duration, for instance. The initializing prvalue (if any) plays no role in that. – Davis Herring Sep 14 '21 at 13:16
  • So are these equations correct: object initialization = object construction, and object creation = storage allocation + object initialization? – Géry Ogam Sep 14 '21 at 19:52
  • @Maggyero: “Construction” is just the name of (much of) the initialization process for classes; creation includes the storage allocation, but may or may not include the initialization (consider a static local variable). – Davis Herring Sep 14 '21 at 21:08
  • A static local variable is [statically-initialized](https://timsong-cpp.github.io/cppwp/dcl.init#general-10) at program startup and [dynamically-initialized](https://timsong-cpp.github.io/cppwp/stmt.dcl#4) the first time control passes through its declaration. Why wouldn’t creation include initialization in this case? – Géry Ogam Sep 14 '21 at 22:13
  • @Maggyero: The static initialization doesn’t begin its lifetime if it has dynamic initialization, at least for objects of class type. If you have yet more questions, please ask them separately: this comment thread is far too long. – Davis Herring Sep 15 '21 at 00:21
  • I have just asked a separate question [here](https://stackoverflow.com/q/69189726/2326961). By the way, thanks for bearing with me so far, I am learning a lot. – Géry Ogam Sep 15 '21 at 08:40
  • @LanguageLawyer [Here](https://stackoverflow.com/q/69189726/2326961) in comments, Davis is wondering if in `int j = i;` an lvalue-to-rvalue conversion happens or not. Any idea? – Géry Ogam Sep 15 '21 at 16:03
  • @Maggyero There are lotsa places where L2R conversion requirement is missing – Language Lawyer Sep 18 '21 at 15:38
  • @LanguageLawyer Where else? – Géry Ogam Sep 18 '21 at 21:33
  • @Maggyero most of operators – Language Lawyer Sep 19 '21 at 07:34
  • @LanguageLawyer Is it worth opening an issue on the cplusplus/draft GitHub repository? – Géry Ogam Sep 19 '21 at 10:33
  • @Maggyero There is a CWG issue for the operators case – Language Lawyer Sep 19 '21 at 11:00
  • @LanguageLawyer [This issue](https://github.com/cplusplus/draft/issues/4917)? Thanks for the report! – Géry Ogam Sep 19 '21 at 15:44
  • @Maggyero CWG1642. – Language Lawyer Sep 19 '21 at 16:55
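
A point made several times in the comments above is that in C++17 `X x = X(X{4});` is one prvalue initializing one object, whereas `const X& r = X{4};` materializes a temporary for the reference to bind to. A minimal sketch of both, assuming a C++17 compiler; `Pinned` is a hypothetical type, used here instead of `X` so that copying and moving can be deleted:

```cpp
#include <type_traits>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int n;
    constexpr explicit Pinned(int v) : n(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

static_assert(!std::is_copy_constructible<Pinned>::value, "no copy constructor");
static_assert(!std::is_move_constructible<Pinned>::value, "no move constructor");

// The nested prvalues only describe how to initialize p. No temporary is
// materialized, so the deleted copy/move constructors are never needed.
constexpr int nested_prvalues() {
    Pinned p = Pinned(Pinned{4});
    return p.n;
}

// Binding a reference needs a glvalue, so the prvalue is materialized into a
// temporary whose lifetime is extended to that of r. Still no copy or move.
constexpr int bound_to_reference() {
    const Pinned& r = Pinned{4};
    return r.n;
}

static_assert(nested_prvalues() == 4, "p was initialized directly by the prvalue");
static_assert(bound_to_reference() == 4, "r refers to the materialized temporary");
```

Under C++14 rules the declaration of `p` is ill-formed because it formally requires the deleted move constructor, which is one way to observe the change in meaning of prvalues.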