
I'm trying to implement a function that takes a Vec<String> and returns another Vec<String> containing every string that matches a pattern.

This is what I tried:

fn select_lines(pattern: &String, lines: &Vec<String>) -> Vec<String> {
    let mut selected_lines: Vec<String> = Vec::new();

    for line in *lines {
        if line.contains(pattern) {
            selected_lines.push(line);
        }
    }

    selected_lines
}

This leads to an error on the line with the for loop (at *lines). I'm very new to Rust (started learning Rust yesterday!) and right now almost clueless on how to resolve this error.

I can remove the * and that error goes away, but then type mismatch errors start to pile up. I would like to keep the signature of the function intact. Is there a way?

Shepmaster
jdnjd

3 Answers


The issue is that you're trying to move ownership of the String instances out of your lines parameter (which is an input parameter) ... transferring ownership into the return value (the output).

There are a couple of options for you.

Option 1 - Clone

The easiest to grok for you would be to just clone the lines out:

selected_lines.push(line.clone());

Now that you've cloned the lines ... there's no ownership issue. What you're returning is new instances of Strings in a vector. They're just copies of the ones you passed in.

Option 2 - Lifetimes

Another option (to avoid the extra allocations), is to just let the compiler know that you're not going to return any references that are left dangling:

// Introduce a lifetime to let the compiler know what you're
// trying to do. This lifetime basically says "the Strings I'm returning
// in the vector live for at least as long as the Strings coming in".
fn select_lines<'a>(pattern: &String, lines: &'a Vec<String>) -> Vec<&'a String> { 
    let mut selected_lines: Vec<&String> = Vec::new();

    for line in lines {
        if line.contains(pattern) {
            selected_lines.push(line);
        }
    }

    selected_lines
}

That is how you can fix your immediate problem.
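
To make the borrowing in Option 2 a bit more concrete, here is a small sketch of how the lifetime version might be called. The function is the same as above; the sample data and the `pattern` variable are made up for illustration.

```rust
// Same function as in Option 2: the returned references borrow from `lines`.
fn select_lines<'a>(pattern: &String, lines: &'a Vec<String>) -> Vec<&'a String> {
    let mut selected_lines: Vec<&'a String> = Vec::new();

    for line in lines {
        if line.contains(pattern) {
            selected_lines.push(line);
        }
    }

    selected_lines
}

fn main() {
    // Sample data, assumed for this example only.
    let lines = vec![
        String::from("Hello"),
        String::from("Stack"),
        String::from("overflow"),
    ];
    let pattern = String::from("over");

    // No Strings are copied; `selected` only holds references into `lines`.
    let selected = select_lines(&pattern, &lines);
    println!("{:?}", selected);

    // `lines` was only borrowed, so it is still usable afterwards.
    println!("{} lines in total", lines.len());
}
```

The trade-off is that `selected` cannot outlive `lines`, since it points into it.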

Another spin

If I were to write this though, I would change it slightly. Here's another spin on it:

fn select_lines<I>(pattern: I, lines: &[I]) -> Vec<&str>
where
    I: AsRef<str>,
{
    let mut selected_lines: Vec<&str> = Vec::new();

    for line in lines {
        if line.as_ref().contains(pattern.as_ref()) {
            selected_lines.push(line.as_ref());
        }
    }

    selected_lines
}

You can use this version with Strings, or &strs, vectors, or slices.

let lines = vec!["Hello", "Stack", "overflow"];

let selected = select_lines("over", &lines);

// prints "overflow"
for l in selected {
    println!("Line: {}", l);
}

let lines2 = [String::from("Hello"), String::from("Stack"), "overflow".into()];

let selected2 = select_lines(String::from("He"), &lines2);

// prints "Hello"
for l in selected2 {
    println!("Line again: {}", l);
}

Here it is running on the playground

Shepmaster
Simon Whitehead
  • I find it interesting that the "another spin" method doesn't require explicit lifetimes (because the compiler can infer them?), but then I'm not exactly the expert on lifetimes either. – MutantOctopus Apr 10 '18 at 06:14
  • It does get a bit "murkier" depending on how you use that option @BHustus. If you're passing these arguments through multiple callsites then the compiler will begin to complain about type mismatches. When this scenario pops up I either bite the bullet and just `Clone` things or try for _smaller_ allocations with a tiny wrapper object (much like the stdlib does with iterator types). For the OPs simple example though it should work alright. – Simon Whitehead Apr 10 '18 at 06:38
  • @BHustus yes, the [compiler infer lifetimes](https://doc.rust-lang.org/book/second-edition/ch10-03-lifetime-syntax.html): If there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters – attdona Apr 10 '18 at 08:42
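
The elision rule quoted in the comment above can be sketched by writing the "another spin" signature twice, once with the lifetime elided and once spelled out. The names `select_elided` and `select_explicit` are hypothetical and only exist for this comparison; as far as I understand the rule, both signatures should mean the same thing to the compiler.

```rust
// Elided form, as written in the answer: `lines` carries the only elided
// input lifetime, so it is assigned to the `&str`s in the output.
fn select_elided<I>(pattern: I, lines: &[I]) -> Vec<&str>
where
    I: AsRef<str>,
{
    let mut selected = Vec::new();
    for line in lines {
        if line.as_ref().contains(pattern.as_ref()) {
            selected.push(line.as_ref());
        }
    }
    selected
}

// The same function with the lifetime written out by hand.
fn select_explicit<'a, I>(pattern: I, lines: &'a [I]) -> Vec<&'a str>
where
    I: AsRef<str>,
{
    let mut selected = Vec::new();
    for line in lines {
        if line.as_ref().contains(pattern.as_ref()) {
            selected.push(line.as_ref());
        }
    }
    selected
}

fn main() {
    let lines = ["Hello", "Stack", "overflow"];

    // Both versions return the same borrowed string slices.
    println!("{:?}", select_elided("over", &lines[..]));
    println!("{:?}", select_explicit("over", &lines[..]));
}
```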

The other answers are correct, but the idiomatic solution would not involve reinventing the wheel:

fn main() {
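    // Owned Strings, so the element types noted in the comments below hold.
    let lines = vec![
        "Hello".to_string(),
        "Stack".to_string(),
        "overflow".to_string(),
    ];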

    // Vec<String>
    let selected: Vec<_> = lines
        .iter()
        .filter(|l| l.contains("over"))
        .cloned()
        .collect();
    println!("{:?}", selected);

    // Vec<&String>
    let selected: Vec<_> = lines
        .iter()
        .filter(|l| l.contains("over"))
        .collect();
    println!("{:?}", selected);
}
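
Since the question asks to keep the original signature, here is a minimal sketch of how the iterator chain might be wrapped in a function with that exact signature. This is one possible arrangement, not necessarily how the answer's author would write it; the sample data in `main` is made up.

```rust
// The signature from the question, with the body replaced by an iterator chain.
fn select_lines(pattern: &String, lines: &Vec<String>) -> Vec<String> {
    lines
        .iter()                                         // yields &String
        .filter(|line| line.contains(pattern.as_str())) // keep matching lines
        .cloned()                                       // &String -> String
        .collect()
}

fn main() {
    let lines = vec![
        String::from("Hello"),
        String::from("Stack"),
        String::from("overflow"),
    ];
    let selected = select_lines(&String::from("over"), &lines);
    println!("{:?}", selected);
}
```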


Shepmaster

Short version: Remove the dereference and push(line.clone()) instead.


The why

Take the following code:

fn main() {
    let mut foo = vec![1, 2, 3, 4, 5];

    for num in foo {
        println!("{}", num);
    }

    foo.push(6);
}

Playground

When running this code, the following error is raised:

error[E0382]: use of moved value: `foo`
 --> src/main.rs:8:5
  |
4 |     for num in foo {
  |                --- value moved here
...
8 |     foo.push(6);
  |     ^^^ value used here after move
  |
  = note: move occurs because `foo` has type `std::vec::Vec<i32>`, which does not implement the `Copy` trait

This error arises because Rust for loops take ownership of the value being iterated over, specifically via the IntoIterator trait. The for loop in the above code can be equivalently written as for num in foo.into_iter().

Note the signature of into_iter(). It takes self rather than &self; in other words, ownership of the value is moved into the function, which creates an iterator for use in the for loop, and the generated iterator is dropped at the end of the loop. Hence why the above code fails: We are attempting to use a variable which was "handed over" to something else. In other languages, the typical term used is that the value used for the loop is consumed.
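
For contrast, here is a sketch of the same loop iterating over a borrow of the vector instead. `IntoIterator` is also implemented for `&Vec<T>`, so the loop only borrows the vector and it stays usable afterwards. The function name `iterate_then_push` is made up for this example.

```rust
// Hypothetical helper: loops over a borrow of `foo`, then keeps using `foo`.
fn iterate_then_push(mut foo: Vec<i32>) -> Vec<i32> {
    // `&foo` is consumed by the loop, not `foo` itself; each `num` is an `&i32`.
    for num in &foo {
        println!("{}", num);
    }

    // Still allowed, because the vector was never moved.
    foo.push(6);
    foo
}

fn main() {
    let foo = iterate_then_push(vec![1, 2, 3, 4, 5]);
    println!("{:?}", foo);
}
```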

Acknowledging this behavior gets us to the root of your problem, namely the "move" in "cannot move out of borrowed content". When you have a reference, like lines (a reference to a Vec), you have only that - a borrow. You do not have ownership of the object, and therefore you cannot give ownership of that object to something else; allowing that could cause the memory errors that Rust is designed to prevent. Dereferencing lines effectively says "I want to give the original vector to this loop", which you can't do, since the original vector belongs to someone else.
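
A small sketch of what this looks like when the function only holds a reference: iterating over the reference itself hands out borrows of each element rather than the elements themselves. `count_matches` is a hypothetical function used only to show the types involved.

```rust
// Hypothetical example: `lines` is only borrowed, so the loop yields borrows too.
fn count_matches(pattern: &str, lines: &Vec<String>) -> usize {
    let mut count = 0;

    // `for line in *lines` would try to move the Vec out from behind the
    // reference, which the compiler rejects.
    for line in lines {
        // Each item is a `&String`, not an owned `String`.
        let line: &String = line;
        if line.contains(pattern) {
            count += 1;
        }
    }

    count
}

fn main() {
    let lines = vec![String::from("Hello"), String::from("overflow")];
    println!("{}", count_matches("o", &lines));
}
```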

Loosely speaking - and I may be wrong on this front, someone please correct me if I am - dereferencing in Rust is mostly useful for modifying the object on the left-hand side of an assignment, since using a dereference on the right-hand side of an expression will try to move the item out of the reference (unless its type implements Copy, such as i32, in which case the value is simply copied). For example:

fn main() {
    let mut foo = vec![1, 2, 3, 4, 5];
    println!("{:?}", foo);

    {
        let num_ref: &mut i32 = &mut foo[2]; // Take a reference to an item in the vec
        *num_ref = 12; // Modify the item pointed to by num_ref
    }

    println!("{:?}", foo);
}

Playground

The above code will print:

[1, 2, 3, 4, 5]
[1, 2, 12, 4, 5]
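
To illustrate the right-hand-side case as well, here is a rough sketch of reading through a reference. The function `read_through_refs` is invented for this example. As far as I can tell, dereferencing an `&i32` just copies the number, while dereferencing an `&String` to obtain an owned value is rejected unless you clone.

```rust
// Hypothetical helper showing what a dereference on the right-hand side does.
fn read_through_refs(n: &i32, s: &String) -> (i32, String, usize) {
    // i32 implements Copy, so `*n` copies the value out of the reference.
    let copied: i32 = *n;

    // String is not Copy; the next line would fail to compile because it
    // tries to move the String out of the borrow:
    // let moved: String = *s;

    // Cloning creates a new, independent String instead.
    let cloned: String = (*s).clone();

    // Calling a `&self` method through the dereference does not move anything.
    let len = (*s).len();

    (copied, cloned, len)
}

fn main() {
    let result = read_through_refs(&5, &String::from("hi"));
    println!("{:?}", result);
}
```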

The how

So the unfortunate truth is that there is no way to use the dereference in this case. But you're in luck - there's an easy way to solve your issue of type mismatch. The handy trait, Clone, defines a function called clone() that is expected to create an entirely new instance of the type, with the same values. Most basic types in Rust implement Clone, including String. So with a single function call, your type mismatch woes go away:

fn select_lines(pattern: &String, lines: &Vec<String>) -> Vec<String> {
    let mut selected_lines: Vec<String> = Vec::new();

    for line in lines {
        if line.contains(pattern) {
            // 'line.clone()' will create a new String for
            // each line that contains the pattern, and
            // place it into the vector.
            selected_lines.push(line.clone());
        }
    }

    selected_lines
}

Playground, with an example
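
As a quick usage sketch (the sample data is made up), calling the cloning version leaves the input untouched, and the returned Strings are independent copies that can be changed freely:

```rust
// Same cloning function as above, with the signature from the question.
fn select_lines(pattern: &String, lines: &Vec<String>) -> Vec<String> {
    let mut selected_lines: Vec<String> = Vec::new();

    for line in lines {
        if line.contains(pattern) {
            selected_lines.push(line.clone());
        }
    }

    selected_lines
}

fn main() {
    let lines = vec![
        String::from("Hello"),
        String::from("Stack"),
        String::from("overflow"),
    ];
    let pattern = String::from("over");

    let mut selected = select_lines(&pattern, &lines);

    // Modifying the copy does not affect the original line.
    selected[0].push_str("!");

    println!("{:?}", selected);
    println!("{:?}", lines);
}
```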

clone() is your friend, and you should get familiar with it, but be aware that it does require additional memory (up to double, if all the lines match), and that the lines placed into selected_lines cannot be easily linked back to their counterparts in lines. You shouldn't worry about the memory issue until you move out of experimentation and into production with very large datasets. If the latter problem poses an issue, though, I'd like to point you to this alternative solution, which does edit the function signature in order to return references to the matching lines instead:

fn select_lines<'a>(pattern: &String, lines: &'a Vec<String>) -> Vec<&'a String> {
    let mut selected_lines: Vec<&'a String> = Vec::new();

    for line in lines {
        if line.contains(pattern) {
            // `line` is already a `&'a String`, so push it directly.
            selected_lines.push(line);
        }
    }

    selected_lines
}

This example includes Lifetimes, which is something you likely won't need to learn for a while, but you're free to examine this example as you wish!

Playground link, with an example, and some mutability edits.
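
Here is a sketch of what "linking back" could look like with the reference-returning version: because each result points at the original String, its position in `lines` can be recovered by comparing addresses with `std::ptr::eq`. The helper `index_in` is hypothetical and just one way to do this.

```rust
// Reference-returning version, as above.
fn select_lines<'a>(pattern: &String, lines: &'a Vec<String>) -> Vec<&'a String> {
    let mut selected_lines: Vec<&'a String> = Vec::new();

    for line in lines {
        if line.contains(pattern) {
            selected_lines.push(line);
        }
    }

    selected_lines
}

// Hypothetical helper: finds which element of `lines` a reference points to,
// comparing addresses rather than string contents.
fn index_in(lines: &Vec<String>, item: &String) -> Option<usize> {
    lines
        .iter()
        .position(|candidate| std::ptr::eq(candidate, item))
}

fn main() {
    let lines = vec![
        String::from("Hello"),
        String::from("Stack"),
        String::from("overflow"),
    ];
    let pattern = String::from("over");

    let selected = select_lines(&pattern, &lines);
    for item in &selected {
        println!("{:?} is at index {:?}", item, index_in(&lines, item));
    }
}
```

A cloned String would have equal contents but a different address, so this kind of lookup would not find it.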

MutantOctopus