Spirit Grammar For Path Verification

Question

I am trying to write a simple grammar using Boost.Spirit to validate that a string is a valid directory path. This is the first grammar I have attempted, so I am working from these tutorials: http://www.boost.org/doc/libs/1_36_0/libs/spirit/doc/html/spirit/qi_and_karma.html http://www.boost.org/doc/libs/1_48_0/libs/spirit/doc/html/spirit/qi/reference/directive/lexeme.html http://www.boost.org/doc/libs/1_44_0/libs/spirit/doc/html/spirit/qi/tutorials/employee___parsing_into_structs.html

Currently, what I have come up with is:

// I want these to be valid matches
std::string valid1 = "./";
// This string could be any number of sub dirs i.e. /home/user/test/ is valid
std::string valid2 = "/home/user/";

using namespace boost::spirit::qi;
bool match = phrase_parse(valid1.begin(), valid1.end(), lexeme[
    (char_('.') | char_('/')) >> +char_ >> char_('/')],
    ascii::space);
if (match)
{
    std::cout << "Match!" << std::endl;
}

However, this matches nothing. I had a few ideas as to why, but my research has not turned up an answer. For example, I assume +char_ will consume all of the remaining characters? If so, how can I check that a sequence of characters ends with /?

Essentially, my thinking behind the code above was that directories starting with . or / should be valid, and the last character has to be a /. Could someone help me with my grammar, or point me to an example closer to what I want to do? This is purely an exercise to learn how to use Spirit.

Edit: I have now got the parser to match using:

bool match = phrase_parse(valid1.begin(), valid1.end(), lexeme[
    (char_('.') | char_('/')) >> *(+char_ >> char_('/'))],
    ascii::space);
if (match)
{
    std::cout << "Match!" << std::endl;
}

I am not sure whether that is correct, or whether it matches for some other reason. Also, should ascii::space be used here? I read in a tutorial that it makes the parser whitespace-agnostic, i.e. a b is equivalent to ab, which I wouldn't want in a path name. If it isn't the correct thing to use, what would be?

SSCCE:

#include <string>
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_char.hpp>
#include <boost/spirit/include/qi_eoi.hpp>

int main()
{
  namespace qi = boost::spirit::qi;
  std::string valid1 = "./";
  std::string valid2 = "/home/blah/";
  bool match = qi::parse(valid2.begin(), valid2.end(), &((qi::lit("./")|'/') >> (+~qi::char_('/') % '/') >> qi::eoi));

  if (match)
  {
    std::cout << "Match" << std::endl;
  }
}

score 2 · Accepted Answer · answered Mar 31 '16 at 14:49

If you don't want to ignore space differences (which you shouldn't), use parse instead of phrase_parse. The use of lexeme inhibits the skipper again (so you were just stripping leading/trailing space). See also stackoverflow.com/questions/17072987/boost-spirit-skipper-issues/17073965#17073965

Use char_("ab") instead of char_('a')|char_('b').

*char_ matches everything. You may have meant *~char_('/').

I'd suggest something like

 bool ok = qi::parse(b, f, &(lit("./")|'/') >> (*~char_('/') % '/'));

This won't expose the matched input. Wrap it in raw[] to achieve that.

Add > qi::eoi to assert all of the input was consumed.

sehe

I am having some trouble finding documentation on what *~char_('/') % '/' means. Could you explain? – joshu Mar 31 '16 at 16:57
I also realized this would match /home//blah as well, which should be excluded. – joshu Mar 31 '16 at 17:06
It's in the same page that documents the character parsers – sehe Mar 31 '16 at 17:50
Change * to + to require at least 1 character between path separators – sehe Mar 31 '16 at 17:51
I didn't see anything about the % operator on this page: http://www.boost.org/doc/libs/1_46_1/libs/spirit/doc/html/spirit/qi/reference/char/char.html Also adding a + instead of a * makes it not match again. – joshu Mar 31 '16 at 18:43
Of course % is not listed with the character parsers. Look at the [operators](http://www.boost.org/doc/libs/1_60_0/libs/spirit/doc/html/spirit/qi/reference/operator.html) section instead. If + "makes it not parse" then you're doing something else differently. If you show the SSCCE I could help – sehe Mar 31 '16 at 19:12
I just added it. Thanks for your help! – joshu Mar 31 '16 at 19:59
Okay, yes your test input ends in a slash. If you want to still allow that (i.e. "empty" tail path element) then you need to say it: `>> -qi::lit('/') >> qi::eoi` – sehe Mar 31 '16 at 20:38
I was thinking that may be the problem. That fixed it. Thanks for all of your help! – joshu Mar 31 '16 at 20:56
