
I'm new to Rust and Nom and I'm trying to parse a single-quoted string which may contain escaped quotes, e.g. 'foo\' bar', 'λx → x', '' or ' '.

I found the escaped! macro, whose documentation says:

The first argument matches the normal characters (it must not accept the control character), the second argument is the control character (like \ in most languages), the third argument matches the escaped characters

Since I want to match anything but a backslash in the matcher for “normal characters”, I tried using take_till!:

    named!(till_backslash<&str, &str>, take_till!(|ch| ch == '\\'));
    named!(esc<&str, &str>, escaped!(call!(till_backslash), '\\', one_of!("'n\\")));

    let (input, _) = nom::character::complete::char('\'')(input)?;
    let (input, value) = esc(input)?;
    let (input, _) = nom::character::complete::char('\'')(input)?;

    // … use `value`

However, when trying to parse 'x', this returns Err(Incomplete(Size(1))). When searching for this, people generally recommend using CompleteStr, but that's not in Nom 5. What's the correct way to approach this problem?

beta
  • It seems that you are (still) using nom 4. There's a new version (5) which more or less obsoletes the macros, e.g. `named!`, and uses functions instead. You can take a look at https://docs.rs/nom/5.0.1/nom/ – hellow Nov 17 '19 at 20:11
  • @hellow: Maybe that's where the disconnect is? I could very well be misunderstanding something… I *am* actually using Nom 5.0.1, but I'm still using many of the macros because if I go to https://docs.rs/nom/5.0.1/nom/character/complete/index.html, there are very few combinators, and nothing like `escaped` or `take_till`. For bytes there *is* an `escaped` function (https://docs.rs/nom/5.0.1/nom/bytes/complete/fn.escaped.html), but I'm not working on bytes, I'm working on (Unicode) text. – beta Nov 17 '19 at 20:15

1 Answer

When operating in so-called streaming mode, nom may return Incomplete to indicate that it can't decide yet and needs more data. nom 4 introduced CompleteStr which, together with CompleteByteSlice, was the complete-input counterpart of &str and &[u8]. Parsers that take them as input work in complete mode.

They are gone in nom 5. In nom 5, macro-based parsers always work in streaming mode, as you've observed. Parser combinators that behave differently in streaming and complete mode come in two versions that live in separate sub-modules, such as nom::bytes::streaming and nom::bytes::complete.
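
To make the difference concrete, here is a toy model in plain Rust (no nom involved) of how a streaming and a complete "take until" parser might treat input that ends before the stop character appears. `Outcome`, `take_till_streaming` and `take_till_complete` are hypothetical names made up for this illustration; they are not nom's API or its implementation.

```rust
/// Outcome of a toy "take until the stop character" parser.
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    /// Parsed successfully: (remaining input, consumed prefix).
    Done(&'a str, &'a str),
    /// Streaming mode only: the input ended before a decision could be made.
    Incomplete,
}

/// Streaming flavour: reaching the end of the input without seeing the stop
/// character means more data might still arrive, so no result is produced.
fn take_till_streaming(input: &str, stop: char) -> Outcome<'_> {
    match input.find(stop) {
        Some(i) => Outcome::Done(&input[i..], &input[..i]),
        None => Outcome::Incomplete,
    }
}

/// Complete flavour: the input is known to be all there is, so reaching the
/// end simply consumes everything.
fn take_till_complete(input: &str, stop: char) -> Outcome<'_> {
    match input.find(stop) {
        Some(i) => Outcome::Done(&input[i..], &input[..i]),
        None => Outcome::Done(&input[input.len()..], input),
    }
}

fn main() {
    // What is left of 'x' once the opening quote has been consumed.
    let rest = "x'";
    assert_eq!(take_till_streaming(rest, '\\'), Outcome::Incomplete);
    assert_eq!(take_till_complete(rest, '\\'), Outcome::Done("", "x'"));
}
```

In this model the streaming variant never sees a backslash in `x'` and therefore asks for more input, which mirrors the Incomplete error from the question.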

For all these gory details you may want to check out this blog post, especially the section Streaming VS complete parsers.

Also, the function combinators are preferred over the macro ones in nom 5. Here is one way to do it:

    //# nom = "5.0.1"
    use nom::{
        branch::alt,
        bytes::complete::{escaped, tag},
        character::complete::none_of,
        sequence::delimited,
        IResult,
    };

    fn main() {
        let (_, res) = parse_quoted(r#"'foo\'  bar'"#).unwrap();
        assert_eq!(res, r#"foo\'  bar"#);
        let (_, res) = parse_quoted("'λx → x'").unwrap();
        assert_eq!(res, "λx → x");
        let (_, res) = parse_quoted("'  '").unwrap();
        assert_eq!(res, "  ");
        let (_, res) = parse_quoted("''").unwrap();
        assert_eq!(res, "");
    }

    fn parse_quoted(input: &str) -> IResult<&str, &str> {
        let esc = escaped(none_of("\\\'"), '\\', tag("'"));
        let esc_or_empty = alt((esc, tag("")));
        let res = delimited(tag("'"), esc_or_empty, tag("'"))(input)?;

        Ok(res)
    }
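
Note that the parsed value is a slice of the original input, so it still contains the backslash, as the first assertion shows. If the unescaped text is needed, a small post-processing step can produce it. The following is a minimal sketch in plain Rust; `unescape` is a hypothetical helper, not part of nom, and it assumes that any character following a backslash should be kept literally.

```rust
/// Hypothetical helper: drops each escaping backslash and keeps the
/// character that follows it.
fn unescape(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(ch) = chars.next() {
        if ch == '\\' {
            match chars.next() {
                // Keep the escaped character verbatim.
                Some(next) => out.push(next),
                // A lone trailing backslash is kept as is (an assumption).
                None => out.push('\\'),
            }
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    assert_eq!(unescape(r"foo\'  bar"), "foo'  bar");
    assert_eq!(unescape("λx → x"), "λx → x");
    assert_eq!(unescape(""), "");
}
```

Because the result no longer borrows from the input, it has to be an owned String rather than a &str.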
edwardw
  • Thanks! I was under the (false) impression that the functions from `nom::bytes` wouldn't work for `&str`, but it seems that it even works if I change the delimiters from single quotes to e.g. `!` – beta Nov 18 '19 at 10:51