Need clarification on the Rust Nomicon section on (co)variance of `Box`, `Vec` and other collections

Question

The Rust Nomicon has an entire section on variance which I more or less understand except this little section in regards to Box<T> and Vec<T> being (co)variant over T.

Box and Vec are interesting cases because they're variant, but you can definitely store values in them! This is where Rust gets really clever: it's fine for them to be variant because you can only store values in them via a mutable reference! The mutable reference makes the whole type invariant, and therefore prevents you from smuggling a short-lived type into them.

What confuses me is the following line:

it's fine for them to be variant because you can only store values in them via a mutable reference!

My first question is that I'm slightly confused as to what the mutable reference is to. Is it a mutable reference to the Box / Vec?

If so, how does the fact that I can only store values in them via a mutable reference justify their (co)variance? I understand what (co)variance is and the benefits of having it for Box<T>, Vec<T> etc., but I am struggling to see the link between only being able to store values via mutable references and the justification of (co)variance.

Also, when we initialize a Box, aren't values moved into the box without involving a mutable reference? Doesn't this contradict the statement that we can only store values in them via a mutable reference?

And finally, under what context is this 'mutable reference' borrowed? Do they mean that when you call methods that modify the Box or Vec you implicitly take an &mut self? Is that the mutable reference mentioned?

Update 2nd May 2018:

Since I have yet to receive a satisfactory answer to this question, I take it that the nomicon's explanation is genuinely confusing. So as promised in a comment thread below, I have opened an issue in the Rust Nomicon repository. You can track any updates there.

It's funny, I was just reading that page today, and didn't really get that part either. The question that I have is: what does the fact that `&mut Box<T>` is invariant over `T` actually prevent? For example, [this code](https://play.rust-lang.org/?gist=457974e57f4e702aab7bf7a7b0c817f6&version=stable), which replaces the `&'a str` in a `Box<&'a str>` with an `&'static str`, works fine, as it should, but it seems like the kind of thing that would be disallowed because `&mut T` is invariant over `T`. — Michael Hewson, Apr 24 '18 at 09:01
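The linked playground code is not reproduced in this thread, so the following is only a sketch of the kind of program the comment describes. The helper name `set_static` and the string values are made up for illustration. It shows that writing a longer-lived `&'static str` through an `&mut Box<&'a str>` type-checks, since a `&'static str` can be used wherever a `&'a str` is expected.

```rust
// Hypothetical helper (not from the linked playground): overwrite the boxed
// reference through a mutable reference to the box.
fn set_static<'a>(b: &mut Box<&'a str>) {
    // A `&'static str` is a subtype of `&'a str`, so this assignment is fine.
    **b = "static text";
}

fn main() {
    let owned = String::from("short-lived");
    // The box starts out holding a reference that borrows from `owned`.
    let mut boxed: Box<&str> = Box::new(owned.as_str());
    set_static(&mut boxed);
    println!("{}", boxed);
}
```

Invariance of `&mut T` only blocks the dangerous direction, where a shorter-lived reference would be written into a slot that promises a longer lifetime.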

Peter Hall · Answer 1 · 2018-04-24T11:58:28.050

2

I think that section could use some work to make it clearer.

I'm slightly confused as to what the mutable reference is to. Is it a mutable reference to the Box / Vec?

No. It means, if you store values in an existing Box, you'd have to do that via a mutable reference to the data, for example using Box::borrow_mut().

The main idea this section is trying to convey is that you can't modify the contents of a Box while there is another reference to the contents. That's guaranteed because the Box owns its contents. In order to change the contents of a Box, you have to do it by taking a new mutable reference.

This means that — even if you did overwrite the contents with a shorter-lived value — it wouldn't matter because no one else could be using the old value. The borrow checker wouldn't allow it.

This is different from function arguments because a function has a code block which can actually do things with its arguments. In the case of a Box or Vec, you have to get the contents out, by mutably borrowing them, before you can do anything to them.
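As a rough illustration of the point about needing a fresh mutable borrow, here is a minimal sketch. The helper `replace_contents` is hypothetical and the values are arbitrary. It changes the contents of a `Box` through a mutable borrow, and the commented-out lines show the kind of aliasing the borrow checker refuses.

```rust
use std::mem;

// Hypothetical helper: swap a new value into the box through a mutable
// borrow of its contents and hand back the old value.
fn replace_contents(b: &mut Box<String>, new_value: String) -> String {
    mem::replace(&mut **b, new_value)
}

fn main() {
    let mut b = Box::new(String::from("old"));
    let previous = replace_contents(&mut b, String::from("new"));
    println!("{} -> {}", previous, b);

    // Keeping a shared reference to the contents alive across the mutation
    // is rejected by the borrow checker:
    // let r: &String = &b;
    // replace_contents(&mut b, String::from("newer"));
    // println!("{}", r);
}
```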

edited Apr 24 '18 at 11:58

answered Apr 24 '18 at 10:57

Peter Hall


Thanks for the response. I've thought about your answer for a long while and I think I've got the gist of what you're saying. I hope you can check if my understanding is correct. – L.Y. Sim Apr 24 '18 at 18:47
In the paragraph above the `Box` and `Vec` section in the nomicon, it was explained that an `&mut T` cannot be covariant over `T` because `T` has an owner that should ultimately 'control' the lifetime of `T`. But had we allowed `&mut T` to be covariant over `T`, then we could have used a short lived value where a longer one is required, as shown in the Nomicon's sample code with the function `overwrite`. The invariance protects us from that illegal operation. – L.Y. Sim Apr 24 '18 at 18:47
Allowing a `Box<E>` to be covariant over `E` does not introduce the risk of having its stored value be substituted with an illegal one because the only way to modify a value being stored in a `Box<E>` is to first obtain a mutable reference to the value (e.g. use `.borrow_mut()` to get `&mut E`). Therefore in every possible context where we intend to modify the value within a `Box<E>`, we have to go through an `&mut E` which is invariant in `E`, which then means we can't do the illegal operation, which also means `Box<E>` is effectively invariant in `E` in a mutation context. – L.Y. Sim Apr 24 '18 at 18:48
1

@LYSim I think you're overthinking it. The text in the nomicon is not well written and I don't think you should worry about understanding it if you already understand why it's ok for `T` to be covariant in `Box<T>`. What you've said is basically right, though I'm not sure about your last sentence. Any `Box<T>` can be mutated via `borrow_mut()` and the variance we're talking about here is in the argument to the _type constructor_. The whole discussion here is to show that it's safe for that variance to be permitted. – Peter Hall Apr 25 '18 at 11:23
Thank you @Peter Hall, I think I understand the issue a bit more now. If I don't receive any other answers to this question, I'll contact the guys who do the documentation and see if they can improve the section. – L.Y. Sim Apr 25 '18 at 12:12

attdona · Answer 2 · 2018-05-02T06:26:24.323

1

From the Nomicon:

Box and Vec are interesting cases because they're variant, but you can definitely store values in them! This is where Rust gets really clever: it's fine for them to be variant because you can only store values in them via a mutable reference! The mutable reference makes the whole type invariant, and therefore prevents you from smuggling a short-lived type into them.

Consider the Vec method that adds a value:

pub fn push(&'a mut self, value: T)

The type of self is &'a mut Vec<T>, and I understand this to be the mutable reference the Nomicon is speaking about. Instantiating the last sentence of the quoted passage for the Vec case, it becomes:

The type &'a mut Vec<T> is invariant, and therefore prevents you from smuggling a short-lived type into Vec<T>.

The same reasoning holds for Box.

Put another way: the values contained in a Vec or Box always outlive their container, even though Vec and Box are variant, because you can only store values in them via a mutable reference.

Consider the following snippet:

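fn main() {
    let mut v: Vec<&String> = Vec::new();

    {
        let mut a_value = "hola".to_string();

        // Fully qualified form of `v.push(&mut a_value)`: the first argument
        // is the `&mut Vec<&String>` that the Nomicon is talking about.
        Vec::push(&mut v, &mut a_value);
    } // `a_value` is dropped here

    // The Nomicon is saying that if `&mut Vec<T>` were covariant over `T`,
    // `v` would now hold a reference pointing to freed memory. That is not
    // the case: the compiler reports that `a_value` does not live long
    // enough. With non-lexical lifetimes, the error shows up because `v`
    // is still used after the block.
    println!("{:?}", v);
}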

It should help to note the similarity between Vec::push(&mut v, &mut a_value) and overwrite(&mut forever_str, &mut &*string) from the Nomicon example.

edited May 02 '18 at 06:26

answered May 01 '18 at 19:42

attdona


That was the initial impression I got, that modifying a `Box` or `Vec` requires getting an `&mut self` reference, and that is the mutable reference mentioned. But that doesn't jibe with the statement that 'you can only store values in them via a mutable reference!', since you can instantiate a `Box` with an initial value without involving a mutable reference. – L.Y. Sim May 02 '18 at 01:39
Furthermore, there's this sentence you mentioned: "The type `&'a mut Vec<T>` is invariant, and therefore prevents you from smuggling a short-lived type into `Vec<T>`.". I understand that `&'a mut Vec<T>` is invariant over `Vec<T>`. But I am struggling to visualize how said invariance prevents us from 'inserting' a shorter-lived element into `Vec<T>`. Could you perhaps write some example code to show me what you mean? – L.Y. Sim May 02 '18 at 01:43
Hi Sim, I added a snippet. I hope it may help – attdona May 02 '18 at 06:27
I think I finally get it now. Let me summarize to see if I'm correct. Consider in your example lifetimes `'b : 'a` where say `'b` is some lifetime in the enclosing scope, and `'a` is the lifetime of `"hola"` in the nested block. Now, since in general `&mut T` is invariant over `T`, this means that `&mut Vec<&'b str> : &mut Vec<&'a str>` is **not true even though** `Vec<&'b str> : Vec<&'a str>` is true. – L.Y. Sim May 02 '18 at 07:15
Therefore, while `Vec::push()` will happily accept an `&mut self` argument of `&mut Vec<&'a str>`, it cannot (and shouldn't) accept `&mut Vec<&'b str>` because it's not a subtype of the former. This then prevents us from being able to store items that do not live long enough into `Vec<&'b str>` (such as `"hola"`). And that is what was meant when the nomicon said that taking a mutable reference makes 'the whole type invariant'. – L.Y. Sim May 02 '18 at 07:30
Yes, you summarized exactly what I've understood about variance related to lifetimes. – attdona May 02 '18 at 07:41
Honestly what you've said seems to make the most sense. Showing `push()` in terms of a qualified method call was what really helped me. I'm selecting your answer as best. We'll see what the nomicon maintainers say in the open issue. – L.Y. Sim May 02 '18 at 07:45
For the Box case, see this [example](https://play.rust-lang.org/?gist=8db43eb47f3ed62147e30d7003b632c5&version=stable&mode=debug). Box::new creates a new boxed instance; it does not assign a value to something that already exists and has a lifetime. This is what I understand, and I hope I'm not adding confusion to the argument! – attdona May 02 '18 at 08:01

L.Y. Sim · Accepted Answer · 2018-05-08T15:30:34.320

Since opening the issue in the Nomicon repo, the maintainers have introduced a revision to the section which I feel is considerably clearer. The revision has been merged. I consider my question answered by the revision.

Below I provide a brief summary of what I know.

The part that relates to my question now reads as follows (emphasis mine):

Box and Vec are interesting cases because they're covariant, but you can definitely store values in them! This is where Rust's typesystem allows it to be a bit more clever than others. To understand why it's sound for owning containers to be covariant over their contents, we must consider the two ways in which a mutation may occur: by-value or by-reference.

If mutation is by-value, then the old location that remembers extra details is moved out of, meaning it can't use the value anymore. So we simply don't need to worry about anyone remembering dangerous details. Put another way, applying subtyping when passing by-value destroys details forever. For example, this compiles and is fine:
 fn get_box<'a>(str: &'a str) -> Box<&'a str> {
     // String literals are `&'static str`s, but it's fine for us to
     // "forget" this and let the caller think the string won't live that long.
     Box::new("hello")
 }
If mutation is by-reference, then our container is passed as &mut Vec<T>. But &mut is invariant over its value, so &mut Vec<T> is actually invariant over T. So the fact that Vec<T> is covariant over T doesn't matter at all when mutating by-reference.

The key point here really is the parallel between the invariance of &mut Vec<T> over T and the invariance of &mut T over T.

It was explained earlier in the revised nomicon section why a general &mut T cannot be covariant over T. &mut T borrows T, but it doesn't own T, meaning that there are other things that refer to T and have a certain expectation of its lifetime.

But if we were allowed to pass &mut T covariant over T, then the overwrite function in the nomicon's example shows how we can break the lifetime of T in the caller's location from a different location (i.e. within the body of overwrite).
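For readers without the Nomicon open, here is a sketch shaped like its `overwrite` example, written from memory, so details may differ from the published text. The variable `other` and the string values are illustrative assumptions. The live call compiles because both arguments agree on `T = &'static str`; the commented-out call is the one that invariance of `&mut T` is meant to reject.

```rust
// Copies the value behind `new` into the slot behind `input`.
// Both arguments must agree on exactly the same `T`.
fn overwrite<T: Copy>(input: &mut T, new: &mut T) {
    *input = *new;
}

fn main() {
    let mut forever_str: &'static str = "hello";
    let mut other: &'static str = "world";

    // Fine: both slots hold `&'static str`.
    overwrite(&mut forever_str, &mut other);
    println!("{}", forever_str);

    // Rejected: `T` is pinned to `&'static str` by `forever_str`, so the
    // reference into `string` would also have to be `'static`.
    // {
    //     let string = String::from("short-lived");
    //     overwrite(&mut forever_str, &mut &*string);
    // }
    // println!("{}", forever_str);
}
```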

In a sense, allowing covariance over T for a type constructor allows us to 'forget the original lifetime of T' when passing the type constructor, and this 'forgetting the original lifetime of T' is ok for &T because there is no chance of us modifying T through it, but it's dangerous when we have an &mut T because we have the ability to modify T after forgetting lifetime details about it. This is why &mut T needs to be invariant over T.

It seems the point the nomicon is trying to make is: it's OK for Box<T> to be covariant over T because it does not introduce unsafeness.

One of the consequences of this covariance is that we are allowed to 'forget the original lifetime of T' when passing Box<T> by value. But this does not introduce unsafeness because when we pass by value, we are guaranteeing that there are no further users of T in the location that Box<T> was moved from. No one else in the old location is counting on the previous lifetime of T to remain so after the move.
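The by-value case can be sketched with `Vec`, used here as another owning container on the assumption that it behaves like `Box` in this respect. The function `push_short` and the values are hypothetical. A `Vec<&'static str>` is moved into a function expecting `Vec<&'a str>`, after which a shorter-lived reference can be pushed, and the original binding can no longer observe it.

```rust
// Hypothetical helper: takes the vector by value, so the caller gives up
// its `Vec<&'static str>` and only the `Vec<&'a str>` view survives.
fn push_short<'a>(mut items: Vec<&'a str>, extra: &'a str) -> Vec<&'a str> {
    items.push(extra);
    items
}

fn main() {
    let statics: Vec<&'static str> = vec!["a", "b"];
    let local = String::from("c");

    // `statics` is moved here; covariance lets it be treated as a vector
    // of references that only live as long as `local`.
    let mixed = push_short(statics, &local);
    println!("{:?}", mixed);

    // The old binding is gone, so nothing still expects `'static` contents:
    // println!("{:?}", statics); // error: use of moved value
}
```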

But more importantly, Box<T> being covariant over T does not introduce unsafeness when it comes to taking a mutable reference to the Box<T>, because &mut Box<T> is invariant over Box<T> and therefore invariant over T. So, similar to the &mut T discussion above, we are unable to perform lifetime shenanigans through an &mut Box<T> by forgetting lifetime details about T and then modifying it after.

score 0 · Answer 4 · answered Apr 25 '18 at 02:32

0

I guess the point is that, while you can convert a Box<&'static str> to a Box<&'a str> (because Box<T> is covariant), you can't convert an &mut Box<&'static str> to an &mut Box<&'a str> (because &mut T is invariant).
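A minimal sketch of that contrast, with hypothetical function names `shorten` and `shorten_mut`. The by-value conversion compiles, while the by-reference version is left commented out because it should be rejected with a lifetime error.

```rust
// By value: `Box<&'static str>` can be returned where `Box<&'a str>` is
// expected, because `Box<T>` is covariant over `T`.
fn shorten<'a>(b: Box<&'static str>) -> Box<&'a str> {
    b
}

// By mutable reference: the same conversion is not accepted, because
// `&mut T` is invariant over `T`.
// fn shorten_mut<'r, 'a>(b: &'r mut Box<&'static str>) -> &'r mut Box<&'a str> {
//     b
// }

fn main() {
    let b: Box<&str> = shorten(Box::new("hello"));
    println!("{}", b);
}
```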

answered Apr 25 '18 at 02:32

Michael Hewson


What you said sort of makes sense, but I'm struggling to see the connection between not being able to use an `&mut Box<&'static str>` where an `&mut Box<&'a str>` is expected and this sentence: "it's fine for them to be variant because you can only store values in them via a mutable reference!". – L.Y. Sim Apr 25 '18 at 03:26
After reading the response by @peter-hall, I'm more inclined to agree with him that the 'mutable reference' mentioned is the mutable reference to the item being stored in the `Box` or `Vec`. But that leaves the question of what mechanism actually makes the `Box` or `Vec` invariant *when* the mutable reference is taken. – L.Y. Sim Apr 25 '18 at 03:52
Heads up: the nomicon has revised the section, so you might want to give it a look. I've posted a link and summarized what I know in another answer to this question. – L.Y. Sim May 04 '18 at 04:23
