33

I'm looking for a method that consumes a Vec and returns one element, without the overhead of restoring Vec's invariants the way remove and swap_remove do:

fn take<T>(vec: Vec<T>, index: usize) -> Option<T>

However, I can't find such a method. Am I missing something? Is this actually unsafe or impossible?

This is a different question from Built in *safe* way to move out of Vec<T>? There the goal was a remove method that didn't panic on out of bounds access and returned a Result. I'm looking for a method that consumes a Vec and returns one of the elements. None of the answers to the above question address my question.

Others
  • 2,876
  • 2
  • 30
  • 52
  • Do you mean specifically `Option` rather than `Option<&T>` that you can get from `vec.get(index)`, or did you miss that `.get` exists? – loganfsmyth Aug 21 '17 at 19:04
  • 2
    @loganfsmyth I specifically mean `Option` like it says in my question. What I want is similar to `option.take()` if that makes sense? – Others Aug 21 '17 at 19:05
  • 1
    Note: if you find yourself throwing `Vec` on a regular basis, you may want to see if you can avoid materializing them to start with. Not doing anything is always faster than doing something, no matter how efficient you are at doing it. – Matthieu M. Aug 22 '17 at 09:37

2 Answers2

25

You can write your function like this:

fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if vec.get(index).is_none() {
        None
    } else {
        Some(vec.swap_remove(index))
    }
}

The code you see here (get and swap_remove) is guaranteed O(1).

However, kind of hidden, vec is dropped at the end of the function and this drop operation is likely not O(1), but O(n) (where n is vec.len()). If T implements Drop, then drop() is called for every element still inside the vector, meaning dropping the vector is guaranteed O(n). If T does not implement Drop, then the Vec only needs to deallocate the memory. The time complexity of the dealloc operation depends on the allocator and is not specified, so we cannot assume it is O(1).


To mention another solution using iterators:

fn take<T>(vec: Vec<T>, index: usize) -> Option<T> {
    vec.into_iter().nth(index)
}

While Iterator::nth() usually is a linear time operation, the iterator over a vector overrides this method to make it a O(1) operation. Of course, if T implements Drop, this again is an O(n) function as n elements need to be dropped.

Lukas Kalbertodt
  • 79,749
  • 26
  • 255
  • 305
  • 1
    Ah, this is a bit of an ugly solution, but I think it should work! I still think a `take` method on Vec would be great... – Others Aug 21 '17 at 19:38
  • @Others Maybe you should open a RFC – Boiethios Aug 22 '17 at 05:43
  • How about changing `mut vec: Vec` to `vec: &mut Vec` in first solution? – lukwol Aug 22 '17 at 06:18
  • 1
    @lukwol: Why? The OP explicitly asked something that consumes the vector. – Matthieu M. Aug 22 '17 at 09:19
  • 1
    I don't know what was the case when this answer was written, but as of today, `into_iter().nth()` _is_ O(1), unless the items implement `Drop` - but that is also true about the "`swap_remove()` then drop the `Vec`" approach. – Chayim Friedman May 04 '23 at 15:08
7

The reason fn take<T>(vec: Vec<T>, index: usize) -> Option<T> does not exist in the standard library is that it is not very useful in general. For example, supposing that you have a Vec<String> of length 10, it means throwing away 9 strings and only using 1. This seems wasteful.

In general, the standard library will try to provide an API that is useful in a maximum of scenarios, and in this instance it would be more logical to have a fn take<T>(vec: &mut Vec<T>, index: usize) -> Option<T>.

The only question is how to preserve the invariant, of course:

  • it can be preserved by exchanging with the last element, which is what Vec::swap_remove does,
  • it can be preserved by shifting the successor elements in, which is what Vec::drain does.

Those are very flexible, and can be adapted to fill more specific scenarios, such as yours.


Adapting swap_remove:

fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        Some(vec.swap_remove(index))
    } else {
        None
    }
}

Adapting drain:

fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        vec.drain(index..index+1).next()
    } else {
        None
    }
}

Noting that the former is more efficient: it's O(1).


I'm looking for a method that consumes the Vec and returns one element, without the overhead of restoring Vec's invariants the way remove and swap_remove do.

This reeks of premature micro-optimization to me.

First of all, note that it is necessary to destroy the elements of the vector; you can accomplish this in two ways:

  1. swap_remove, then iterate over each element to destroy them,
  2. Iterate over each element to destroy them, skipping the specific index.

It is not clear to me that the latter would be faster than the former; if anything it looks more complicated, with more branches (I advise two loops), which may throw off the predictor and may be less amenable to vectorization.

Secondly, before complaining about the overhead of restoring the Vec's invariant, have you properly profiled the solution?

If we look at the swap_remove variant, there are 3 steps:

  1. swap_remove (O(1)),
  2. destroy each remaining element (O(N)),
  3. free the backing memory.

Step 2 may be optimized out if the element has no Drop implementation, but otherwise I would be it's a toss whether (2) or (3) is dominating the cost.

TL;DR: I am afraid that you are fighting ghost issues, profile before trying to optimize.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • 1
    I'm not necessarily only complaining about the overhead of restoring Vec's invariants (although the code I'm working with is in a tight inner loop, so speed really does matter to me), I also want to write code that follows what I mean. In this case, I really want to express the `take` operation, not a `remove` operation. Even if this was just a speed issue, I need an alternative to profile against anyway. – Others Aug 22 '17 at 15:16
  • 1
    @Others: "Tight inner loop" and "dropping `Vec`" don't sound too good together. Once you've written the code, I encourage you to post it to Code Reviews or reddit and ask how to optimize it; hopefully there'll be a way to avoid a memory deallocation in your inner loop as those are costly. – Matthieu M. Aug 22 '17 at 16:38