Many iterator methods in Rust return iterators that wrap other iterators. One such case is the skip method, which skips the given number of elements and yields the remaining ones through a Skip struct that implements the Iterator trait.

I would like to read a file line by line, and sometimes skip the first n characters of a line. I figured that using Iterator::skip would work, but now I'm stuck figuring out how I can actually unwrap the underlying Chars iterator so I could materialize the remaining &str with chars.as_str().

What is the idiomatic way of unwrapping an iterator in Rust? The call chain

let line: &String = ...;
let remaining = line.chars().skip(n).as_str().trim();

raises the error

error[E0599]: no method named `as_str` found for struct `std::iter::Skip<std::str::Chars<'_>>` in the current scope
   --> src/parser/directive_parsers.rs:367:63
    |
367 |         let option_val = line.chars().skip(option_val_indent).as_str().trim();
    |                                                               ^^^^^^ method not found in `std::iter::Skip<std::str::Chars<'_>>`

error: aborting due to previous error
Peter Hall
sesodesa
  • `Skip` is just an iterator over the items of the underlying iterator but without the first n items. Just use it like an iterator. The nested iterator in your situation sounds unrelated - it's because you are iterating over lines and then over characters. – Peter Hall Aug 19 '20 at 09:18
  • `Skip` is an iterator. You need to `.collect()` it. – L. Riemer Aug 19 '20 at 09:24
  • It actually sounds like you don't really want to be iterating over chars at all. You probably just want to take a slice of the strings? Doing it via chars, you will need to collect into a `String`, which seems like an unnecessary allocation. – Peter Hall Aug 19 '20 at 09:26
  • @PeterHall Yeah, taking slices would be wonderful, but I don't know the byte indices of my characters beforehand. That's why I've been wasteful and dealing mostly with `Chars` iterators. – sesodesa Aug 19 '20 at 09:29
  • Meaning you must handle all valid UTF-8? Depending on your use case, manually walking `.char_indices()` might be a significant improvement, especially since it saves you an additional allocation. Smells like premature optimization though; make a note in the code and move on. – L. Riemer Aug 19 '20 at 09:37

2 Answers

You can retrieve the start byte index of the nth character using the nth() method on the char_indices() iterator on the string. Once you have this byte index, you can use it to get a subslice of the original string:

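let line = "This is a line.";
let n = 5; // number of characters to skip
// nth(n) returns None (so unwrap panics) if the line has n or fewer characters
let index = line.char_indices().nth(n).unwrap().0;
let remaining = &line[index..];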
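
As a minimal sketch of the same idea without the panic on short lines: the helper below (the name `skip_chars` is hypothetical, not part of the answer) falls back to an empty slice when the line has n or fewer characters, which is an assumed policy that mirrors what `chars().skip(n)` would yield in that case.

```rust
/// Returns the part of `line` that follows its first `n` characters.
/// `skip_chars` is a hypothetical helper name used for illustration only.
fn skip_chars(line: &str, n: usize) -> &str {
    // char_indices() yields (byte_index, char) pairs, so the index found
    // here always falls on a character boundary.
    match line.char_indices().nth(n) {
        Some((index, _)) => &line[index..],
        // Assumed policy: a line with n or fewer characters has nothing left.
        None => "",
    }
}

fn main() {
    // Non-ASCII characters occupy more than one byte each.
    let line = "äö  value";
    let remaining = skip_chars(line, 2).trim();
    println!("{:?}", remaining);
}
```

Because the result is a slice of the original string, no allocation takes place, and `trim()` can be chained directly as in the question.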
Sven Marnach

Rather than iterate over chars, you can use char_indices to find the exact point at which to take a slice from the string, ensuring that you don't index into the middle of a multi-byte character. This will save on an allocation for each line in the iterator:

input
    .iter()
    .map(|line| {
        let n = 2; // get n from somewhere?
        // unwrap panics on lines with n or fewer characters; use better error handling
        let (index, _) = line.char_indices().nth(n).unwrap();
        &line[index..]
    })
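
One possible way to add the error handling mentioned in the comment, shown as a sketch rather than a definitive implementation: the function name `strip_leading_chars`, the in-memory sample lines, and the choice to map short lines to an empty slice are all assumptions made for illustration.

```rust
/// For each line, returns the slice that follows its first `n` characters.
/// `strip_leading_chars` is a hypothetical name used for illustration only.
fn strip_leading_chars(lines: &[String], n: usize) -> Vec<&str> {
    lines
        .iter()
        .map(|line| {
            line.char_indices()
                .nth(n)
                // Assumed policy: lines with n or fewer characters become "".
                .map_or("", |(index, _)| &line[index..])
        })
        .collect()
}

fn main() {
    // Stand-in for lines read from a file.
    let input = vec![
        String::from("  indented"),
        String::from("żółw walks"),
        String::from("x"),
    ];
    let remaining = strip_leading_chars(&input, 2);
    println!("{:?}", remaining);
}
```

The returned slices borrow from `input`, so the only allocation is the `Vec` that holds them.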
Peter Hall