2

Continuing from How do I write combinators for my own parsers in Rust?, I stumbled into this question concerning bounds of functions that consume and/or yield functions/closures.

From these slides, I learned that to be convenient for consumers, you should try to take functions as FnOnce and return as Fn where possible. This gives the caller most freedom what to pass and what to do with the returned function.

In my example, FnOnce is not possible because I need to call that function multiple times. While trying to make it compile I arrived at two possibilities:

pub enum Parsed<'a, T> {
    Some(T, &'a str),
    None(&'a str),
}

impl<'a, T> Parsed<'a, T> {
    pub fn unwrap(self) -> (T, &'a str) {
        match self {
            Parsed::Some(head, tail) => (head, &tail),
            _ => panic!("Called unwrap on nothing."),
        }
    }

    pub fn is_none(&self) -> bool {
        match self {
            Parsed::None(_) => true,
            _ => false,
        }
    }
}

pub fn achar(character: char) -> impl Fn(&str) -> Parsed<char> {
    move |input|
        match input.chars().next() {
            Some(c) if c == character => Parsed::Some(c, &input[1..]),
            _ => Parsed::None(input),
        }
}

pub fn some_v1<T>(parser: impl Fn(&str) -> Parsed<T>) -> impl Fn(&str) -> Parsed<Vec<T>> {
    move |input| {
        let mut re = Vec::new();
        let mut pos = input;
        loop {
            match parser(pos) {
                Parsed::Some(head, tail) => {
                    re.push(head);
                    pos = tail;
                }
                Parsed::None(_) => break,
            }
        }
        Parsed::Some(re, pos)
    }
}

pub fn some_v2<T>(mut parser: impl FnMut(&str) -> Parsed<T>) -> impl FnMut(&str) -> Parsed<Vec<T>> {
    move |input| {
        let mut re = Vec::new();
        let mut pos = input;
        loop {
            match parser(pos) {
                Parsed::Some(head, tail) => {
                    re.push(head);
                    pos = tail;
                }
                Parsed::None(_) => break,
            }
        }
        Parsed::Some(re, pos)
    }
}

#[test]
fn try_it() {
    assert_eq!(some_v1(achar('#'))("##comment").unwrap(), (vec!['#', '#'], "comment"));
    assert_eq!(some_v2(achar('#'))("##comment").unwrap(), (vec!['#', '#'], "comment"));
}

playground

Now I don't know which version is to be preferred. Version 1 takes Fn which is less general, but version 2 needs its parameter mutable.

Which one is more idiomatic/should be used and what is the rationale behind?


Update: Thanks jplatte for the suggestion on version one. I updated the code here, that case I find even more interesting.

primfaktor
  • 2,831
  • 25
  • 34
  • 1
    Binding an argument as mutable doesn't imply any constraint on the caller. It's the same as if you had left out that `mut` in the function signature and had `let mut parser = parser;` as the first line in `some_v2`. See [Why can immutable variables be passed as arguments to functions that require mutable arguments?](https://stackoverflow.com/q/54120899/4639273) and [What's the difference between placing “mut” before a variable name and after the “:”?](https://stackoverflow.com/q/28587698/4639273). – SCappella Feb 15 '20 at 12:39

1 Answers1

1

Comparing some_v1 and some_v2 as you wrote them I would say version 2 should definitely be preferred because it is more general. I can't think of a good example for a parsing closure that would implement FnMut but not Fn, but there's really no disadvantage to parser being mut - as noted in the first comment on your question this doesn't constrain the caller in any way.

However, there is a way in which you can make version 1 more general (not strictly more general, just partially) than version 2, and that is by returning impl Fn(&str) -> … instead of impl FnMut(&str) -> …. By doing that, you get two functions that each are less constrained than the other in some way, so it might even make sense to keep both:

  • Version 1 with the return type change would be more restrictive in its argument (the callable can't mutate its associated data) but less restrictive in its return type (you guarantee that the returned callable doesn't mutate its associated data)
  • Version 2 would be less restrictive in its argument (the callable is allowed to mutate its associated data) but more restrictive in its return type (the returned callable might mutate its associated data)
jplatte
  • 1,121
  • 11
  • 21
  • 1
    See updated question, thanks. With that change, version 1 seems more appropriate for combining (hopefully) stateless parsers, correct? – primfaktor Feb 18 '20 at 08:53
  • Yes, exactly. For example you could then pass `some_v1` to itself (which isn't particularly useful) or a similar combinator (e.g. `fn exactly(times: usize, parser: impl Fn(&str) -> Parsed) -> impl Fn(&str) -> Parsed>`). – jplatte Feb 18 '20 at 09:24