I'm trying to wrap my head around the nom crate. The simple problem I'm trying to solve is writing a parser that counts the comment lines in a file. There are two types of comments to parse:
- Single-line comments using //
- Multi-line comments using /* ... */
Here is the code I have so far:
use nom::{
    Err, IResult, Parser,
    branch::alt,
    bytes::complete::{is_not, tag, take_until},
    character::complete::{char, line_ending},
    combinator::{value, eof, map, not, success, all_consuming},
    error::{ErrorKind, ParseError},
    multi::many0,
    sequence::{pair, tuple, preceded, delimited, terminated},
};
// matches on single line comment
fn single_line_comment(s: &str) -> IResult<&str, &str> {
    preceded(tag("//"), is_not("\n\r"))(s)
}

// returns `1` for single line comment match
pub fn count_single_line_comment(i: &str) -> IResult<&str, usize> {
    value(1, single_line_comment)(i)
}

// matches on a multi-line comment
fn multi_line_comments(i: &str) -> IResult<&str, &str> {
    delimited(tag("/*"), take_until("*/"), tag("*/"))(i)
}

// returns the line count of a multi-line comment
fn count_multi_line_comments(i: &str) -> IResult<&str, usize> {
    map(multi_line_comments, |s| s.lines().count())(i)
}

// helper parser that matches on either of the two comment types
fn _count_comment_lines(s: &str) -> IResult<&str, usize> {
    alt((count_single_line_comment, count_multi_line_comments))(s)
}

// function I would like to write but I can't figure this part out
pub fn count_comment_lines(s: &str) -> IResult<&str, Vec<usize>> {
    many0(alt((
        _count_comment_lines,
        preceded(not(_count_comment_lines), _count_comment_lines),
    )))(s)
}
My logic for the last parser is that it should match repeatedly on either a comment block, or on whatever leads up to the next comment block or to the end of the file. But it doesn't work: running it on my sample strings doesn't yield the values I want.
static SAMPLE1: &str = "//Hello there!\n//What's going on?";
static SAMPLE2: &str = "/* What's the deal with airline food?\nIt keeps getting worse and worse\nI can't take it anymore!*/";
static SAMPLE3: &str = " //Global Variable\nlet x = 5;\n/*TODO:\n\t// Add the number of cats as a variable\n\t//Shouldn't take too long\n*/";
static SAMPLE4: &str = "//First\n//Second//NotThird\n//Third";
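One likely cause of the mismatch: nom's not(p) succeeds only where p fails, and it consumes no input, so preceded(not(p), p) runs p again at the same position and cannot succeed. The snippet below is a minimal sketch of that behaviour written without nom so that it stands alone; `comment`, `not_comment`, and `not_then_comment` are hypothetical stand-ins, not nom's API, and `comment` only handles the `//` case.

```rust
// Stand-in for a comment parser: Some((remaining input, line count)) on success.
// Only handles `//` comments, which is enough to show the problem.
fn comment(s: &str) -> Option<(&str, usize)> {
    let rest = s.strip_prefix("//")?;
    let end = rest
        .find(|c: char| c == '\n' || c == '\r')
        .unwrap_or(rest.len());
    Some((&rest[end..], 1))
}

// Mimics `not(comment)`: succeeds, consuming nothing, only where `comment` fails.
fn not_comment(s: &str) -> Option<(&str, ())> {
    match comment(s) {
        Some(_) => None,
        None => Some((s, ())),
    }
}

// Mimics `preceded(not(comment), comment)`: the second step sees the same
// input on which `comment` has just failed, so the result is always None.
fn not_then_comment(s: &str) -> Option<(&str, usize)> {
    let (rest, ()) = not_comment(s)?;
    comment(rest)
}
```

Under this reading, the second branch of the alt never matches, so many0 would stop at the first character that does not begin a comment and return whatever it had collected up to that point.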
In my head, I want to use take_until, but that doesn't take a parser; it takes a tag. I don't see anything like take_until for parsers, so I'm thinking I have to rethink the combinator approach generally. What do you suggest?
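To make the "take_until for parsers" idea concrete, here is a minimal sketch of the intended behaviour written without nom, so that it is self-contained. It rests on several assumptions: `comment_at_start` and `count_comment_lines_sketch` are hypothetical names, the scanner skips one character at a time whenever no comment starts at the current position, an unterminated `/*` is treated as ordinary text, and string literals containing `//` are not handled specially.

```rust
// Tries to read one comment at the very start of `s`.
// Returns the remaining input and the number of lines the comment covers.
fn comment_at_start(s: &str) -> Option<(&str, usize)> {
    if let Some(rest) = s.strip_prefix("//") {
        // A single-line comment runs up to, but not including, the line break.
        let end = rest
            .find(|c: char| c == '\n' || c == '\r')
            .unwrap_or(rest.len());
        Some((&rest[end..], 1))
    } else if let Some(rest) = s.strip_prefix("/*") {
        // A multi-line comment runs up to the closing `*/`.
        // As in the original code, the lines are counted with `lines().count()`.
        let end = rest.find("*/")?;
        Some((&rest[end + 2..], rest[..end].lines().count()))
    } else {
        None
    }
}

// Repeatedly "takes until" a comment parses, then records its line count.
fn count_comment_lines_sketch(mut s: &str) -> Vec<usize> {
    let mut counts = Vec::new();
    while !s.is_empty() {
        match comment_at_start(s) {
            Some((rest, n)) => {
                counts.push(n);
                s = rest;
            }
            None => {
                // No comment here: skip exactly one character and retry.
                let mut chars = s.chars();
                chars.next();
                s = chars.as_str();
            }
        }
    }
    counts
}
```

With this sketch, SAMPLE1 gives [1, 1] and SAMPLE3 gives [1, 3]; the multi-line comment counts as 3 because `lines()` does not yield an empty final line after the trailing newline. Within nom, combinators such as many_till combined with anychar might fill the role of the skipping step, though that is not tested here.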