
I'm using the nom parser to parse a language. On top of that I'm using nom_supreme for some quality of life improvements (e.g. error handling).

It is going well, but I'm stuck on one puzzle which I'm hoping that someone can help me with.

First for some background, the nom tag function returns a parser that consumes a string. For example:

fn parser(s: &str) -> IResult<&str, &str> {
  tag("Hello")(s)
}

assert_eq!(parser("Hello, World!"), Ok((", World!", "Hello")));

nom_supreme has a drop-in equivalent with the same name that adds some error handling improvements (mainly that it embeds the tag in the error).

The function signature is similar (I've reordered some of the types to make them easier to compare):

// nom_supreme https://github.com/Lucretiel/nom-supreme/blob/f5cc5568c60a853e869c784f8a313fb5c6151391/src/tag.rs#L93

pub fn tag<T, I, E>(tag: T) -> impl Clone + Fn(I) -> IResult<I, I, E>
where
  T: InputLength + Clone,
  I: InputTake + Compare<T>,
  E: TagError<I, T>

vs

// nom https://github.com/rust-bakery/nom/blob/90d78d65a10821272ce8856570605b07a917a6c1/src/bytes/complete.rs#L32

pub fn tag<T, I, E>(tag: T) -> impl Fn(I) -> IResult<I, I, E>
where
  T: InputLength + Clone,
  I: Input + Compare<T>,
  E: ParseError<I>

At a superficial level, they work the same. The difference occurs when I use the nom_supreme parser in a closure.

This example with nom compiles:

pub fn create_test_parser(captured_tag: &str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
    move |i: &str| {
        let captured_tag_parser = nom::bytes::complete::tag(captured_tag);
        let (i, parsed_tag) = captured_tag_parser(i)?;
        Ok((i, String::from(parsed_tag)))
    }
}

but this example with nom_supreme fails with an error:

lifetime may not live long enough: returning this value requires that '1 must outlive 'static

pub fn create_test_parser(captured_tag: &str) -> impl FnMut(&str) -> AsmResult<String> + '_ {
    move |i: &str| {
        let captured_tag_parser = nom_supreme::tag::complete::tag(captured_tag);
        let (i, parsed_tag) = captured_tag_parser(i)?;
        Ok((i, String::from(parsed_tag)))
    }
}

I've tried:

  1. Cloning "captured_tag" -> got a "using 'clone' on a double reference" error
  2. captured_tag.to_owned() -> got a "returns a value referencing data owned by the current function" error
  3. Cloning "captured_tag" in outer scope -> same lifetime error
  4. captured_tag.to_owned() in outer scope -> got a "captured variable cannot escape FnMut" error
  5. Using "Arc" -> this works! But why do I need to resort to higher-level memory management when the standard nom tag function works without it?

I feel like I'm missing some way to transfer the ownership of that string into the closure. The string getting passed into create_test_parser is just a string literal, so it shouldn't really have a lifetime tied to the caller.

If you want to play around with it, a stripped down example project is at: https://github.com/NoxHarmonium/nom-parser-stack-overflow-example/tree/main

Sean Dawson

2 Answers


While I was deconstructing the problem to ask the question I managed to solve my own issue. The solution is in the error message.

returning this value requires that '1 must outlive 'static

I just needed to make sure that the captured string has a static lifetime, which is fine for my use case because all the inputs to the function are string literals.

pub fn create_test_parser(captured_tag: &'static str) -> impl FnMut(&str) -> AsmResult<String> {
    move |i: &str| {
        let captured_tag_parser = nom_supreme::tag::complete::tag(captured_tag);
        let (i, parsed_tag) = captured_tag_parser(i)?;
        Ok((i, String::from(parsed_tag)))
    }
}

However, this might not solve the problem for everyone, and I'm still unsure what the difference is between the nom version of tag and the nom_supreme version which causes this requirement. I'd love it if someone had a more insightful answer!
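For cases where the tag is only known at runtime, one possible workaround is to leak the owned string so that it becomes a `&'static str`. This is a minimal sketch using only the standard library; the helper name `leak_tag` is hypothetical and is not part of nom or nom_supreme. The leaked allocation is never freed, so it is only reasonable for a small, bounded set of tags.

```rust
/// Hypothetical helper (not part of nom or nom_supreme): converts an owned
/// `String` into a `&'static str` by deliberately leaking its heap allocation.
/// The memory is never reclaimed, so call this only for a bounded set of tags.
fn leak_tag(tag: String) -> &'static str {
    // `into_boxed_str` drops any excess capacity, and `Box::leak` hands back
    // a reference that stays valid for the rest of the program.
    Box::leak(tag.into_boxed_str())
}
```

The result could then be passed to a function that requires `&'static str`, such as the `create_test_parser` shown above.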

Sean Dawson
  • I don't get what you don't get :p nom_supreme's tag has a generic, and if this generic contains a lifetime you can't use a temporary item like &'1 str, because it can't be borrowed for as long as it's needed. – Stargateur Jun 27 '23 at 21:48
  • I'm new to Rust, and having type parameters as well as lifetime parameters is a bit confusing to me, but I'm learning slowly. Do you mean that the nom_supreme parser's type signature has a T in its `TagError` bound and the regular nom parser's `ParseError` bound doesn't? I did miss that the first time. It isn't obvious to me why that would default to a lifetime of 'static. Where does 'static come from? Edit: Ah, found it thanks to @isaactfa's answer, it's specified in https://github.com/Lucretiel/nom-supreme/blob/f5cc5568c60a853e869c784f8a313fb5c6151391/src/error.rs#L377 – Sean Dawson Jun 28 '23 at 10:11

Apparently this is obvious, but here is my analysis of this:

pub fn tag<T, I, E>(tag: T) -> impl Clone + Fn(I) -> IResult<I, I, E>
where
    T: InputLength + Clone,
    I: InputTake + Compare<T>,
    E: TagError<I, T>

The thing to note here is the bound E: TagError<I, T>. You pass ErrorTree<&'_ str> in as E. ErrorTree<&'_ str> is a type alias for GenericErrorTree<&'_ str, &'static str, &'static str, Box<dyn Error + Send + Sync + 'static>>.

The trait TagError<I, T> is implemented for GenericErrorTree<I, T: AsRef<[u8]>, C, E>. Notice: because of the way that ErrorTree is defined, the T here is inferred to be &'static str. T is also the type that the tag function takes as input, i.e. the type of captured_tag, so it's forced to be &'static str.

Now, I don't know anything about nom_supreme so there might be very good reasons for this to be the case. The docs even say "T is typically something like &'static str or &'static [u8]."

But: you can define your own ErrorTree alias that's generic over T:

type ErrorTree<I, T> = nom_supreme::error::GenericErrorTree<
    I,
    T,
    &'static str,
    Box<dyn std::error::Error + Send + Sync + 'static>,
>;

and consequently

pub type AsmResult<'a, 'b, O> = IResult<&'a str, O, ErrorTree<&'a str, &'b str>>;

and your function signature becomes

pub fn create_test_parser<'a, 'b>(
    captured_tag: &'a str,
) -> impl FnMut(&'b str) -> AsmResult<'b, 'a, String> + 'a

and it works.

Again, this might break some other nom_supreme API down the pipeline, I don't know, but these lifetimes look and feel more correct to me.
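The mechanism can be reproduced without either crate. The following is a minimal sketch using only the standard library: `TagError`, `GenericError`, `StaticError`, `tag` and the two `parser_with_*` functions are simplified, hypothetical stand-ins and not the real nom or nom_supreme definitions. It assumes the only thing that matters is that the same `T` is both the argument of `tag` and the tag type stored in the error.

```rust
// Simplified stand-ins for illustration only; these are NOT the real
// nom or nom_supreme types.

/// Models nom_supreme's `TagError`: the error can be built from the tag.
trait TagError<T> {
    fn from_tag(tag: T) -> Self;
}

/// Models `GenericErrorTree`: generic over the tag type `T`.
#[derive(Debug, PartialEq)]
struct GenericError<T> {
    expected: T,
}

impl<T> TagError<T> for GenericError<T> {
    fn from_tag(tag: T) -> Self {
        GenericError { expected: tag }
    }
}

/// Models the stock `ErrorTree` alias: the tag type is pinned to `&'static str`.
type StaticError = GenericError<&'static str>;

/// Models nom_supreme's `tag`: the same `T` is both the argument type and
/// the type stored in the error on failure.
fn tag<'i, T, E>(tag: T) -> impl Fn(&'i str) -> Result<(&'i str, &'i str), E>
where
    T: AsRef<str> + Clone,
    E: TagError<T>,
{
    move |input: &'i str| {
        let t = tag.as_ref();
        if input.starts_with(t) {
            // Return (remaining input, matched prefix), like nom does.
            Ok((&input[t.len()..], &input[..t.len()]))
        } else {
            Err(E::from_tag(tag.clone()))
        }
    }
}

/// With the pinned alias, `StaticError: TagError<T>` holds only for
/// `T = &'static str`. Changing `captured` to a shorter-lived `&str` here
/// is rejected by the borrow checker with a "must outlive 'static" error.
fn parser_with_static_error(
    captured: &'static str,
) -> impl Fn(&str) -> Result<String, StaticError> {
    move |i: &str| {
        let parsed: Result<(&str, &str), StaticError> = tag(captured)(i);
        parsed.map(|(_rest, matched)| matched.to_string())
    }
}

/// With an error type that is generic over the tag lifetime, any `&'t str`
/// is accepted, and the returned parser is valid for as long as `'t`.
fn parser_with_generic_error<'t>(
    captured: &'t str,
) -> impl Fn(&str) -> Result<String, GenericError<&'t str>> + 't {
    move |i: &str| {
        let parsed: Result<(&str, &str), GenericError<&'t str>> = tag(captured)(i);
        parsed.map(|(_rest, matched)| matched.to_string())
    }
}
```

In this model, `parser_with_generic_error` can be given a borrowed `String` built at runtime, while `parser_with_static_error` only accepts string literals or otherwise `'static` data, which mirrors the difference between the custom alias and the stock `ErrorTree`.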

isaactfa
  • Thanks so much for explaining this for me! I totally missed that `ErrorTree` resolved to a type with static lifetimes in it. I feel like it is that way for a reason, but I'll try your type at some stage and if it works maybe I'll raise an issue against nom_supreme. – Sean Dawson Jun 28 '23 at 10:18
  • I'm glad it helps. I guess there aren't a lot of reasons that a tag you match against shouldn't be `'static`; you're probably not generating your parsing rules at runtime. And as long as you _use_ `&'static str`s as your input to `create_test_parser`, it'll be exactly the same as _requiring_ them to be `&'static str`. – isaactfa Jun 28 '23 at 10:26