5

I'm learning Rust and ran into a problem. I have this MCVE:

fn main() {
    let mut line = String::new();
    std::io::stdin()
        .read_line(&mut line)
        .expect("Failed to read line");

    handle_tokens( line.split_ascii_whitespace() );
}

fn handle_tokens( mut it: std::str::SplitAsciiWhitespace ) {
    loop {
        match it.next() {
            None => return,
            Some(s) => println!("{}",s),
        }
    }
}

str::split_ascii_whitespace returns a SplitAsciiWhitespace, so I used that type in the signature of handle_tokens. But std::str::SplitAsciiWhitespace is an extremely specific type. A generic iterator over string slices would make more sense, so that I could pass the result of split_whitespace instead, or maybe just a plain list of strings.

How can I use documentation or compiler errors to generalize the signature of handle_tokens?


Here's my failed attempt to answer the question on my own:

I can see that SplitAsciiWhitespace "Trait Implementations" include:

impl<'a> Iterator for SplitAsciiWhitespace<'a>

This is where next() comes from (I had to inspect source code to verify that). Therefore, I tried using an iterator with fn handle_tokens( mut it: Iterator ) { but:

error[E0191]: the value of the associated type `Item` (from trait `std::iter::Iterator`) must be specified
  --> src/main.rs:10:27
   |
10 | fn handle_tokens( mut it: Iterator ) {
   |                           ^^^^^^^^ help: specify the associated type: `Iterator<Item = Type>`

Ok, so Iterator is too generic to use on its own... I need to tell the compiler what it yields. That makes sense, otherwise I wouldn't be able to use the items. I had to look in the source code again to see how SplitAsciiWhitespace implements Iterator and saw type Item = &'a str; so I tried to specify the Item with fn handle_tokens( mut it: Iterator<Item = &str>), but:

error[E0277]: the size for values of type `(dyn std::iter::Iterator<Item = &str> + 'static)` cannot be known at compilation time
  --> src/main.rs:10:19
   |
10 | fn handle_tokens( mut it: Iterator<Item = &str> ) {
   |                   ^^^^^^ doesn't have a size known at compile-time
   |
   = help: the trait `std::marker::Sized` is not implemented for `(dyn std::iter::Iterator<Item = &str> + 'static)`
   = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
   = note: all local variables must have a statically known size
   = help: unsized locals are gated as an unstable feature

Ok, so I need to specify a size as well. That's strange because while I know the size of str can't be known at compile-time, the size of &str should be.

At this point I'm very stuck. I'm also surprised that source-code inspection is necessary when Rust seems to provide such great built-in documentation support. That makes me think that the method I'm using to answer this question is wrong.

Stewart
  • Does this answer your question? [How to properly pass Iterators to a function in Rust](https://stackoverflow.com/questions/57543399/how-to-properly-pass-iterators-to-a-function-in-rust) – E_net4 Jul 16 '20 at 09:34
  • At my (beginner) level, it's hard to comprehend that answer. I'm sure it's not wrong, but adding the `where` `IntoIterator` and `Borrow` isn't something I've gotten to yet. @Kitsu's answer is very clear. – Stewart Jul 16 '20 at 09:40

2 Answers

4

You're on the right path, actually. next is indeed defined on Iterator, and that is the trait you need to use. What you missed is that Iterator is a trait, not a type. A type parameter can be bounded by a trait, so generics come in handy here:

fn handle_tokens<'a, I: Iterator<Item = &'a str>>(mut it: I) { .. }

There's also a special impl-trait syntax which can be used instead:

fn handle_tokens<'a>(mut it: impl Iterator<Item = &'a str>) { .. }

However, with the impl Trait form the caller cannot name the iterator type explicitly, i.e. handle_tokens::<SplitAsciiWhitespace>(iter) only compiles with the first (named generic) signature.
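As a minimal sketch of how both signatures behave, the program below reuses the loop from the question (written as a for loop) and calls it with several different iterators. The helper join_tokens is hypothetical, not part of the original code; it returns the tokens instead of printing them so the result is easy to check.

```rust
// Named generic parameter: works with any iterator that yields &str.
fn handle_tokens<'a, I: Iterator<Item = &'a str>>(it: I) {
    for s in it {
        println!("{}", s);
    }
}

// Hypothetical helper using the impl Trait form of the same bound.
// It joins the tokens with commas instead of printing them.
fn join_tokens<'a>(it: impl Iterator<Item = &'a str>) -> String {
    it.collect::<Vec<_>>().join(",")
}

fn main() {
    let line = String::from("a b c");

    // Three different concrete iterator types, one function.
    handle_tokens(line.split_ascii_whitespace());
    handle_tokens(line.split_whitespace());
    handle_tokens(vec!["x", "y"].into_iter());

    // Naming the type explicitly is only possible with the named generic.
    handle_tokens::<std::str::SplitAsciiWhitespace>(line.split_ascii_whitespace());

    println!("{}", join_tokens(line.split_whitespace()));
}
```

Both forms are compiled separately for each concrete iterator type (static dispatch), so the choice between them is mostly about syntax and whether callers need to name the type.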

Kitsu
  • 3,166
  • 14
  • 28
  • 1
    In practice, you would generally use `IntoIterator` instead of `Iterator` if you want the function to be as generic as possible. – Sven Marnach Jul 16 '20 at 13:16
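A rough sketch of the IntoIterator suggestion from the comment above, assuming the same printing loop as in the question. The usize return value is an addition for illustration only, so that the number of handled tokens can be checked.

```rust
// Accepts anything that can be turned into an iterator over &str:
// iterators themselves, but also collections such as Vec<&str>.
fn handle_tokens<'a, I>(tokens: I) -> usize
where
    I: IntoIterator<Item = &'a str>,
{
    let mut count = 0;
    for s in tokens {
        println!("{}", s);
        count += 1;
    }
    count
}

fn main() {
    let line = String::from("one two three");

    // Every Iterator is also IntoIterator, so this still works.
    handle_tokens(line.split_ascii_whitespace());

    // A Vec<&str> can now be passed directly, without calling into_iter().
    handle_tokens(vec!["a", "b"]);
}
```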
3

handle_tokens only uses next from the Iterator trait and requires the Display trait on the items of the iterator, so you can make the function generic over exactly those two requirements.

use std::fmt::Display;
fn handle_tokens<T>(mut tokens: T)
where
    T: Iterator,
    <T as Iterator>::Item: Display,
{
    loop {
        match tokens.next() {
            None => return,
            Some(s) => println!("{}", s),
        }
    }
}
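Because the only requirement on the items is Display, a function with these bounds is not limited to strings. The sketch below illustrates that with a hypothetical variant, format_tokens, which is not from the original answer: it carries the same bounds but returns the formatted items instead of printing them.

```rust
use std::fmt::Display;

// Hypothetical variant with the same bounds as handle_tokens above,
// returning each item formatted via Display.
fn format_tokens<T>(tokens: T) -> Vec<String>
where
    T: Iterator,
    T::Item: Display,
{
    tokens.map(|t| format!("{}", t)).collect()
}

fn main() {
    // String slices implement Display...
    let words = format_tokens("a b".split_ascii_whitespace());
    // ...and so do integers, so the same function accepts both.
    let numbers = format_tokens(vec![1, 2, 3].into_iter());
    println!("{:?} {:?}", words, numbers);
}
```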

Or you can .collect() the iterator into a Vec and pass that around instead:

let tokens = line.split_ascii_whitespace().collect::<Vec<_>>();
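One possible way to use the collected tokens, as a sketch: the function takes a borrowed slice, &[&str], which is an assumption about how the Vec would be consumed and not something stated above. The returned length is only there so the behaviour can be checked.

```rust
// Takes a borrowed slice of string slices instead of an iterator.
fn handle_tokens(tokens: &[&str]) -> usize {
    for s in tokens {
        println!("{}", s);
    }
    tokens.len()
}

fn main() {
    let line = String::from("alpha beta gamma");
    let tokens = line.split_ascii_whitespace().collect::<Vec<_>>();

    // &Vec<&str> coerces to &[&str] at the call site.
    handle_tokens(&tokens);

    // Unlike a consumed iterator, the Vec can be indexed and reused.
    println!("first token: {}", tokens[0]);
}
```

Collecting allocates a Vec up front, so this trades some memory for random access and the ability to traverse the tokens more than once.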

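I see you tried using dyn. These are called trait objects. Because a trait object has no size known at compile time, it has to sit behind a pointer such as a reference: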


fn handle_tokens3(it: &mut dyn Iterator<Item = &str>) {
    loop {
        match it.next() {
            None => return,
            Some(s) => println!("{}", s),
        }
    }
}
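A small sketch of where a trait object can be useful: picking between two different iterator types at runtime. The ascii_only flag is a hypothetical setting, and the usize return value is added here only so the call can be checked.

```rust
// Same signature idea as handle_tokens3, returning the token count.
fn handle_tokens3(it: &mut dyn Iterator<Item = &str>) -> usize {
    let mut count = 0;
    // &mut dyn Iterator is itself an Iterator, so a for loop works.
    for s in it {
        println!("{}", s);
        count += 1;
    }
    count
}

fn main() {
    let line = String::from("a b\tc");
    let ascii_only = true; // hypothetical runtime choice

    let mut ascii = line.split_ascii_whitespace();
    let mut unicode = line.split_whitespace();

    // The two branches have different concrete types; both coerce
    // to the same trait object type.
    let it: &mut dyn Iterator<Item = &str> = if ascii_only {
        &mut ascii
    } else {
        &mut unicode
    };

    handle_tokens3(it);
}
```

With generics, the two branches could not be stored in one variable, because each concrete iterator is a distinct type.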


  • I tried `dyn` because a warning suggested that. I'm just doing the exercises in chapter 8 of the same book (you linked to chap 17). Good to know this will all make sense soon. – Stewart Jul 16 '20 at 09:56
  • 2
    Beware that using `dyn` has a runtime cost: you're dealing with the object through a virtual function table (vtable), so method calls on it are indirect. Use it only if you know that's what you need. – Jesper Jul 16 '20 at 10:36