1

Consider the following for-in loop in Rust, which moves the non-primitive data type String.

let strings: Vec<String> = something;
for s in strings {
    // uses `s`
}

Since a String is not a bitwise trivially copyable datatype, the elements of strings are moved, one by one, into s and then dropped at the end of the block defined by { and }.

After each iteration, what does the Vec object look like?

Imagine that strings is initialized to contain this:

[ "hello", "world", "final string" ]

After 1 iteration of the loop, the element "hello" has been moved.

We cannot have a Vec now looking like this

[ None, "world", "final string" ]

because None is not a String.

This means the Vec must now be shorter:

[ "world", "final string" ]

Since "hello" was removed from the front of the Vec, this strongly suggests to me that a re-allocation has taken place.

If this is not the case, then why does no re-allocation occur?

AFAIK Vec does not contain something like an offset to the first element in memory, like a C++ std::deque would. So it seems to me that a Rust Vec cannot avoid a re-allocation when the first element is moved.

Finally, assuming I have understood all of the above, which may not be the case, does this mean that iterating over a Vec containing non-copyable objects is slow?

FreelanceConsultant
  • Relevant: https://stackoverflow.com/q/59123462 – E_net4 Apr 18 '23 at 13:47
  • @mkrieger1 This example comes directly from Programming Rust (O'Reilly).... Good point. Not sure if their example is invalid Rust code? It raises a slightly different question about how is it possible to iterate over this `Vec`. – FreelanceConsultant Apr 18 '23 at 13:50

2 Answers

6

TL;DR: It doesn't matter.

Once you started iterating over the Vec, you moved it and you cannot access it anymore, so it doesn't matter what its internal state is.

If you are asking about the implementation, then the Vec itself doesn't change. The iterator holds a pointer to the beginning of the Vec (to free its memory at the end) and to the next and last elements. The elements already iterated over are just discarded, without their memory physically overwritten.
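A minimal sketch of how one might check this, assuming the current standard library behaviour in which `into_iter()` reuses the `Vec`'s heap buffer. The helper name `remaining_offsets` is made up for illustration. It records where the iterator's remaining slice starts, measured in elements from the start of the original buffer, before and after each `next()` call:

```rust
/// Hypothetical helper: for each step of the iteration, returns how many
/// elements past the start of the original buffer the remaining slice begins.
fn remaining_offsets(strings: Vec<String>) -> Vec<usize> {
    // Address of the Vec's heap buffer before it is moved into the iterator.
    let base = strings.as_ptr() as usize;
    let elem_size = std::mem::size_of::<String>();

    // Consumes `strings`; the iterator now owns the buffer and the elements.
    let mut iter = strings.into_iter();

    let mut offsets = vec![(iter.as_slice().as_ptr() as usize - base) / elem_size];
    while let Some(s) = iter.next() {
        // `s` was moved out of the buffer; dropping it frees only that
        // String's own heap allocation, not the Vec's buffer.
        drop(s);
        offsets.push((iter.as_slice().as_ptr() as usize - base) / elem_size);
    }
    offsets
}

fn main() {
    let strings = vec![
        "hello".to_string(),
        "world".to_string(),
        "final string".to_string(),
    ];
    let offsets = remaining_offsets(strings);
    // The start of the remaining elements advances by one slot per step,
    // always inside the same buffer: nothing is shifted or reallocated.
    assert_eq!(offsets, vec![0, 1, 2, 3]);
    println!("{offsets:?}");
}
```

If the elements were shifted to the front or the buffer were reallocated after each step, the offsets would not advance one slot at a time like this.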

oguz ismail
Chayim Friedman
  • The `Vec` isn't mutable @ChayimFriedman - So that's not quite it. I think it has more to do with the interaction of `for - in` and the `IntoIterator` trait. – Kevin Anderson Apr 18 '23 at 13:49
  • If this were the case the `Iterator` would be required in order to free the memory... And I don't think that makes sense? – FreelanceConsultant Apr 18 '23 at 13:51
  • 4
    @KevinAnderson Once a thing is moved, it doesn't matter if it was declared mutable or not. Also, the iterator is mutated, not the `Vec`. – Chayim Friedman Apr 18 '23 at 13:51
  • @FreelanceConsultant Why don't you think it makes sense? – Chayim Friedman Apr 18 '23 at 13:52
  • @ChayimFriedman Coming from C++, it would be weird if a destructor call required an iterator as an argument. Maybe I misunderstood what you intended to say? – FreelanceConsultant Apr 18 '23 at 13:53
  • It would also be weird if an iterator was a member of a container class, hence imposing the constraint that there can only be one iterator per class object. – FreelanceConsultant Apr 18 '23 at 13:54
  • @FreelanceConsultant The destructor of the container doesn't require an iterator, the iterator's destructor needs to free the items, which may involve calling the container destructor (but likely not). And I don't understand why you think there can be only one iterator per container (there is only one `IntoIterator` implementation, but there can be multiple applicable iterators). – Chayim Friedman Apr 18 '23 at 13:57
  • Note that for the specific case of `Vec`, each iteration deallocates one string buffer. The vector itself is only deallocated at the very end as one block. – Sven Marnach Apr 18 '23 at 13:57
  • @SvenMarnach Yes, but this is done by the consumer code (although if the loop is `break`ed the iterator will free the remaining `String`s). – Chayim Friedman Apr 18 '23 at 13:59
  • @ChayimFriedman If there is an iterator which exists, and that iterator stores information about "how far along a container it has iterated", then that information would be required to free the memory of the container, if the act of iterating moves elements of the container. Does that make sense? – FreelanceConsultant Apr 18 '23 at 13:59
  • @FreelanceConsultant Yes, true, that's why it stores this information. What is the question? – Chayim Friedman Apr 18 '23 at 14:01
  • Then the container's destructor requires information about how far the iterator has iterated. Otherwise, it doesn't know which elements to free and which elements to not double-free. – FreelanceConsultant Apr 18 '23 at 14:02
  • 1
    @FreelanceConsultant But once you started iterating, you moved the container and its destructor won't be called. It's the iterator's destructor whose job is to free the elements now. – Chayim Friedman Apr 18 '23 at 14:05
  • 1
    TIL - Even if non-mut, you can still move/consume a variable, not just copy or borrow it. That's really backwards IMO, but hey, it's probably widespread. That's just really unexpected, since `const` really means it in C++ (yes I know about the `mutable` keyword, exceptions to the rule aside I mean). – Kevin Anderson Apr 18 '23 at 14:16
  • @ChayimFriedman That's still very strange as it implies transfer of ownership from a container to an iterator. While I don't disagree, that is something that seems possible, it also seems... weird – FreelanceConsultant Apr 18 '23 at 14:39
  • 1
    @FreelanceConsultant This is how Rust works with moves, and it is also usually how C++ works with move (C++ just doesn't have moving iterators because it doesn't need them). – Chayim Friedman Apr 18 '23 at 14:41
  • @ChayimFriedman How would you write something equivalent in C++? I can't see a way of doing that – FreelanceConsultant Apr 18 '23 at 14:58
0

The for item in container {} syntax implicitly calls container.into_iter(), which consumes container and transfers ownership of its contents to the resulting iterator (for a Vec<String>, a std::vec::IntoIter<String>).

On each iteration of the for loop, the iterator yields a String, which is processed and then dropped.

Here's an example that explicitly calls into_iter() and has explicit type annotations to more clearly demonstrate what's happening:

fn main() {
    let container: Vec<String> = vec!["Hello".to_string(), "World".to_string()];

    {
        let iterator: std::vec::IntoIter<String> = container.into_iter(); // consumes `container`

        // Now `iterator` owns the items that were in `container`.
        // dbg!(container);  // error[E0382]: use of moved value: `container`
        for item in iterator {
            println!("item: {item}");
        }
        // ^^ `iterator` was moved into the `for` loop and is dropped as soon as the loop ends.
    }
}
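To make the drop order visible, here is a small sketch that is not part of the original example. The `Noisy` type and the `drop_order` function are hypothetical names introduced only for illustration. Each element records when it is used and when it is dropped, and the loop exits early with `break` so that one element is left for the iterator itself to clean up:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical element type that records its own destruction in a shared log.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Iterates over a Vec<Noisy> by value, breaks early, and returns the log.
fn drop_order() -> Vec<String> {
    let log: Rc<RefCell<Vec<String>>> = Rc::new(RefCell::new(Vec::new()));
    let items: Vec<Noisy> = vec!["a", "b", "c"]
        .into_iter()
        .map(|name| Noisy { name, log: Rc::clone(&log) })
        .collect();

    for item in items {
        log.borrow_mut().push(format!("use {}", item.name));
        if item.name == "b" {
            break; // "c" has not been yielded yet
        }
        // `item` is dropped here, at the end of each iteration.
    }
    // The iterator was dropped when the loop ended, along with any
    // elements it had not yet yielded.
    log.borrow_mut().push("loop done".to_string());

    let result = log.borrow().clone();
    result
}

fn main() {
    let log = drop_order();
    assert_eq!(
        log,
        vec!["use a", "drop a", "use b", "drop b", "drop c", "loop done"]
    );
    for line in &log {
        println!("{line}");
    }
}
```

Each yielded element is dropped by the loop body, while the element that was never yielded is dropped by the iterator when the loop ends, before any code after the loop runs.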


Colin D Bennett
  • 1
    I would change the wording saying "transfers ownership [...] until the `for` loop is done.", because ownership is actually transferred into the `for` loop and never returned back; You cannot get `iterator` back, and the strings are actually dropped on each loop, and the iterator right after the `for` loop ends, not at the end of scope like you said. – Filipe Rodrigues Apr 18 '23 at 18:35
  • Ok that's interesting, but it still raises questions about how the memory for each element is managed internally. If there is an iterator object which takes ownership of the memory managed by the original container, this suggests to me that one vector-like data structure is being transformed into something else like a deque. (Double-ended queue, maybe even something with links in it like a list.) It just seems slightly suspicious to me that something that can only have `push`, `pop` behaviour suddenly becomes something with `pop_front` behaviour. – FreelanceConsultant Apr 19 '23 at 07:30