
I have the following definition for an object record in PureData that I need to be able to parse into my generic PdObject struct:

Description:
Defines an object
Syntax:
#X obj [x_pos] [y_pos] [object_name] [p1] [p2] [p3] [...];\r\n
Parameters:
[x_pos] - horizontal position within the window
[y_pos] - vertical position within the window
[object_name] - name of the object (optional)
[p1] [p2] [p3] [...] the parameters of the object (optional)
Example:
#X obj 55 50;
#X obj 132 72 trigger bang float;

And I have created the following boost spirit rule that has been tested to work:

template <typename Iterator> struct PdObjectGrammar : qi::grammar<Iterator, PdObject()> { 
    PdObjectGrammar() : PdObjectGrammar::base_type(start) { 
        using namespace qi; 
        start = skip(space)[objectRule]; 
        pdStringRule = +(('\\'  >> space) | (graph-lit(";"))); 
        objectRule = "#X obj" >> int_ >> int_ >> -(pdStringRule) >> *(pdStringRule) >> ";"; 
        BOOST_SPIRIT_DEBUG_NODES((start)(objectRule)(pdStringRule))
    }
    private: 
    qi::rule<Iterator, std::string()> pdStringRule; 
    qi::rule<Iterator, PdObject()> start; 
    qi::rule<Iterator, PdObject(), qi::space_type> objectRule; 

};

However, there are also special "reserved names" that cannot be used, such as "bng," "tgl," "nbx," etc...

For example, here is another type of "obj" using a reserved name keyword that must be parsed separately by a different rule:

#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;

How can I modify my previous qi rule to not parse the above string, and leave it for another grammar to check (which would parse it to a different struct)?

Postscript:

My full test for the PdObjectGrammar is:

#define BOOST_SPIRIT_DEBUG
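#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>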


namespace qi = boost::spirit::qi;

struct PdObject {
int xPos;
int yPos;
std::string name;
std::vector<std::string> params;

};


BOOST_FUSION_ADAPT_STRUCT(
    PdObject,
    xPos,
    yPos,
    name,
    params
)

template <typename Iterator> struct PdObjectGrammar : qi::grammar<Iterator, PdObject()> { 
    PdObjectGrammar() : PdObjectGrammar::base_type(start) { 
        using namespace qi; 
        start = skip(space)[objectRule]; 
        pdStringRule = +(('\\'  >> space) | (graph-lit(";"))); 
        objectRule = "#X obj" >> int_ >> int_ >> -(pdStringRule) >> *(pdStringRule) >> ";"; 
        BOOST_SPIRIT_DEBUG_NODES((start)(objectRule)(pdStringRule))
    }
    private: 
    qi::rule<Iterator, std::string()> pdStringRule; 
    qi::rule<Iterator, PdObject()> start; 
    qi::rule<Iterator, PdObject(), qi::space_type> objectRule; 

};


int main(int argc, char** argv)
{
    if (argc != 2)
    {
        std::cout << "Usage: " << argv[0] << " <PatchFile>" << std::endl;
        exit(1);
    }

    std::ifstream inputFile(argv[1]); 
    std::string inputString(std::istreambuf_iterator<char>(inputFile), {}); 

    PdObject msg;
    PdObjectGrammar<std::string::iterator> parser; 

    bool success = qi::phrase_parse(inputString.begin(), inputString.end(), parser, boost::spirit::ascii::space, msg); 
    std::cout << "Success: " << success << std::endl;

    return 0; 

}
stix
  • as a Pd-guy speaking: if you consider `bng` and `tgl` and what-not to be *special "reserved names"*, then your understanding of Pd is flawed - these names fall in the same category as `float`, `fiddle~` and `shubidoo` (that is: they are in no way special). this of course doesn't make the general question any less interesting. – umläute May 16 '23 at 21:09
  • @umläute they change the form of the #X obj grammar and result in special graphics on the PD canvas. I don't see how you can say they aren't "special" compared to any regular object. They need special parsing consideration when loading a patch, which is why this question was formed in the first place. – stix May 17 '23 at 21:43
  • they don't. they are ordinary objects, but their functionality includes "drawing things" (unlike most other objects), just like the functionality of `fiddle~` includes some frequency-domain analysis (unlike most other objects!). from the perspective of the on-disk representation there is no difference. or put otherwise: the `shubidoo` i mentioned above *might* show up as some fancy GUI-thingy or it might not - but the on-disk representation is always `#X shubidoo ...;` – umläute May 22 '23 at 08:35
  • @umläute If one hits ctrl + 1, and types "bng" on the PD canvas, you will get a bang object. It does not represent in a .pd file the same as if you hit ctrl +1 and type "osc~" There are multiple different definitions for "#X obj" in the only thing remotely approaching documentation of the .pd file format. You just said yourself their "functionality includes things..." that other objects do, so then they aren't "ordinary objects" from a parsing nor from a code standpoint. Q.E.D. – stix May 22 '23 at 19:50
  • i'm pretty sure i know what i'm talking about here. if you hit Ctrl+1 and type `bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000` this is the representation of the object (as saved on disk) - much the same as when typing `pack 0 0 0 0`. if you only type `bng` then the ordinary *bng* object uses default arguments and saves these to disk - no magic here: i've written abstractions that save arguments to disk that are different from the ones provided when creating them (admittedly with the help of externals) - do you want a list of them to include in your special-handling? – umläute May 22 '23 at 22:23
  • "There are multiple different definitions for "#X obj" in the only thing remotely approaching documentation of the .pd file format"... that's a problem of the documentation, not of the format. in reality, whatever goes after the `#X obj ` is a message to a hidden object, with the selector of the message (the object name) selecting the constructor callback and the rest (until the end of the FUDI message) being args. the object is free to save itself as it wants to (most built-ins just don't care for anything extraordinary). Q.E.D.? – umläute May 22 '23 at 22:28
  • whichever volunteer wrote that "documentation" probably just picked those iemgui objects because you typically do not see their arguments (and there's no way to change them by simply editing them). but that page is not part of any official documentation, and there are *numerous* external objects out there that follow the same pattern. check `iem_gui`, `else`, `cyclone`, `moonlib`, `unauthorized`, `tof` to name just a couple of libs that are about 20 years old (except for `else`). as add-ons they obviously cannot be treated as "reserved" keywords. – umläute May 22 '23 at 22:37
  • @umläute I daresay if you guys knew what you were doing PureData would have proper documentation in the first place... – stix May 24 '23 at 18:32

1 Answer


In a way "keywordness" is not part of the grammar. It's a semantic check.

There's not a standard way in which grammars deal with keywords. For example C++ has a number of identifiers that are contextually reserved only.

The short story of it is you will just have to express your constraints in code or validate semantics after-the-fact (on the parsed result).
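The "validate after the fact" option can be sketched in plain C++, independent of Spirit. This is only an illustration: the `PdObject` shape is copied from the question, while `isReservedName` and `isGenericObject` are hypothetical helper names, and the list of names is simply the one used in this answer's examples.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Same shape as the PdObject struct from the question.
struct PdObject {
    int xPos;
    int yPos;
    std::string name;
    std::vector<std::string> params;
};

// Hypothetical helper: true when the name is on the "reserved" list.
// The list itself is an assumption taken from the examples in this answer.
inline bool isReservedName(std::string const& name) {
    static std::unordered_set<std::string> const reserved{
        "bng", "tgl", "nbx", "vsl", "hsl", "vradio", "hradio", "vu", "cnv"};
    return reserved.count(name) != 0;
}

// Semantic check to run on an already parsed object: accept it as a
// generic object only when its name is not reserved.
inline bool isGenericObject(PdObject const& obj) {
    return !isReservedName(obj.name);
}
```

Because the comparison is against the whole parsed name, a name such as `bngalore` is not mistaken for `bng` here.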

Naively:

string     = +('\\' >> qi::space | qi::graph - ";");
name       = string - "bng" - "tgl" - "nbx" - "vsl" - "hsl" - "vradio" - "hradio" - "vu" - "cnv";
object     = "#X obj"       //
    >> qi::int_ >> qi::int_ //
    >> -name                //
    >> *string >> ";";

Or, with a negative lookahead:

string     = +('\\' >> qi::space | qi::graph - ";");
builtin    = qi::lit("bng") | "tgl" | "nbx" | "vsl" | "hsl" | "vradio" | "hradio" | "vu" | "cnv";
object     = "#X obj"        //
    >> qi::int_ >> qi::int_  //
    >> -(!builtin >> string) //
    >> *string >> ";";

Symbols

You can make this a bit more elegant, maintainable and possibly more efficient by defining a symbols table for it:

qi::symbols<char> builtin;


// ...
builtin += "bng", "tgl", "nbx", "vsl", "hsl", "vradio", "hradio", "vu", "cnv";

string = +('\\' >> qi::space | qi::graph - ";");
object = "#X obj"                //
         >> qi::int_ >> qi::int_ //
         >> -(string - builtin)    //
         >> *string >> ";";

Distinct Keywords

There's a flaw. When the user names their object something that merely starts with one of the builtin names, like bngalore or vslander, the builtin will match that prefix and the name will be rejected.

To account for this, make sure we're on a lexeme boundary:

auto kw = [](auto const& p) { return qi::copy(qi::lexeme[p >> !(qi::graph - ';')]); };
string = +('\\' >> qi::space | qi::graph - ";");
object = "#X obj"                //
    >> qi::int_ >> qi::int_      //
    >> -(!kw(builtin) >> string) //
    >> *string >> ";";
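To make the boundary condition concrete, here is the same check written by hand in plain C++ rather than Spirit. It is a sketch only: `isNameChar` and `startsWithKeyword` are hypothetical names, and `isNameChar` is meant to mirror the character class `qi::graph - ';'` used above.

```cpp
#include <cctype>
#include <string>

// Mirrors (qi::graph - ';'): a printable, non-space character other than ';'.
inline bool isNameChar(char c) {
    return std::isgraph(static_cast<unsigned char>(c)) != 0 && c != ';';
}

// True only when `input` starts with `keyword` AND the keyword is not
// immediately followed by another name character (the "boundary").
inline bool startsWithKeyword(std::string const& input, std::string const& keyword) {
    if (input.compare(0, keyword.size(), keyword) != 0)
        return false; // not even a prefix match
    return input.size() == keyword.size() || !isNameChar(input[keyword.size()]);
}
```

With this check, `bng 20 250` counts as the keyword `bng`, while `bngalore 1` does not, because the character after the prefix is still part of the name.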

It doesn't work!

That's because the grammar is flawed. In your defense, the specification is extremely sloppy. It's one of those grammars alright.

With all those things being optional, you should ask yourself, how does the parser know that name is omitted, when there are parameters? As far as I can see the parser could never tell, so when the name is omitted, there cannot be parameters?

We can express that:

string = +('\\' >> qi::space | qi::graph - ";");
object = "#X obj"                                   //
    >> qi::int_ >> qi::int_                         //
    >> !kw(builtin) >> -(string >> *string) >> ";"; //

Oh noes, now the entire (string >> *string) is compatible with just the name attribute...:

Input: "#X obj 132 72 trigger bang float;"
 -> (132 72 "triggerbangfloat" { })

Here I'd advise to adjust the AST to reflect the parsed grammar:

struct GenericObject {
    String              name;
    std::vector<String> params;
};

struct PdObject {
    int           xPos, yPos;
    GenericObject generic;
};

BOOST_FUSION_ADAPT_STRUCT(PdObject, xPos, yPos, generic)
BOOST_FUSION_ADAPT_STRUCT(GenericObject, name, params)

Now it does propagate the attributes correctly. Note the extra sub-object (the nested parentheses) in the output:

Input: "#X obj 132 72 trigger bang float;"
 -> (132 72 ("trigger" { "bang" "float" }))

Taking It All The Way

As a pro tip, don't implement the parser in the same sloppy fashion as the specification was done. Likely, you just want to parse different object types with dedicated AST types and ditto rules.

For really advanced/pluggable grammars, you might dispatch the rules based on the name symbol. That's known as the Nabialek Trick.
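The idea behind that trick is a lookup table keyed on the object name that selects what handles the rest of the input. In Spirit the table would be a `qi::symbols` mapping names to rules, invoked lazily. The plain C++ sketch below only illustrates the dispatch idea and is not Spirit code; `Params`, `Handler` and `dispatch` are hypothetical names, and the handlers merely report which definition kind was chosen.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

using Params  = std::vector<std::string>;
using Handler = std::function<std::string(Params const&)>;

// Hypothetical dispatcher: pick a handler by object name, fall back to a
// generic handler when the name has no dedicated entry.
inline std::string dispatch(std::string const& name, Params const& params) {
    static std::map<std::string, Handler> const table{
        {"vsl", [](Params const& p) { return "Vslider with " + std::to_string(p.size()) + " args"; }},
        {"bng", [](Params const& p) { return "Bang with " + std::to_string(p.size()) + " args"; }},
    };

    auto it = table.find(name);
    if (it != table.end())
        return it->second(params); // dedicated handler for this name

    return "Generic '" + name + "' with " + std::to_string(params.size()) + " args";
}
```

Adding support for a new object type then means adding one table entry, which is what makes this style of grammar pluggable.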

Let's generalize our object rule:

object = "#X obj"           //
    >> qi::int_ >> qi::int_ //
    >> definition           //
    >> ";"                  //
    ;

Now let's demo the VSL rule, in addition to generic objects:

definition = vslider | generic;

Generic is still what we had before:

generic           //
    = opt(string) // name
    >> *string;   // params

Let's do a rough take on Vslider:

vslider                             //
    = qi::lexeme["vsl" >> boundary] //
    >> opt(qi::uint_)               // width
    >> opt(qi::uint_)               // height
    >> opt(qi::double_)             // bottom
    >> opt(qi::double_)             // top
    >> opt(bool_)                   // log
    >> opt(bool_)                   // init
    >> opt(string)                  // send
    >> opt(string)                  // receive
    >> opt(string)                  // label
    >> opt(qi::int_)                // x_off
    >> opt(qi::int_)                // y_off
    >> opt(string)                  // font
    >> opt(qi::uint_)               // fontsize
    >> opt(rgb)                     // bg_color
    >> opt(rgb)                     // fg_color
    >> opt(rgb)                     // label_color
    >> opt(qi::double_)             // default_value
    >> opt(bool_)                   // steady_on_click
    ;

Of course we need a few helpers:

qi::uint_parser<int32_t, 16, 6, 6> hex6{};
rgb = ('#' >> hex6) | qi::int_;

auto boundary = qi::copy(!(qi::graph - ';'));
auto opt = [](auto const& p) { return qi::copy(p | &qi::lit(';')); };

bool_ = qi::bool_ | qi::uint_parser<bool, 2, 1, 1>{};

And the AST types:

struct RGB {
    int32_t rgb;
};

namespace Defs {
    using boost::optional;

    struct Generic {
        String              name;
        std::vector<String> params;
    };

    struct Vslider {
        optional<unsigned> width;           // horizontal size of gui element
        optional<unsigned> height;          // vertical size of gui element
        optional<double>   bottom;          // minimum value
        optional<double>   top;             // maximum value
        bool               log = false;     // when set, the slider range is output
                                            // logarithmically; otherwise its output
                                            // is linear
        String           init;              // sends default value on patch load
        String           send;              // send symbol name
        String           receive;           // receive symbol name
        optional<String> label;             // label
        int              x_off = 0;         // horizontal position of the label
                                            // text relative to the upperleft
                                            // corner of the object
        int y_off = 0;                      // vertical position of the label
                                            // text relative to the upperleft
                                            // corner of the object
        optional<String>   font;            // font type
        optional<unsigned> fontsize;        // font size
        optional<RGB>      bg_color;        // background color
        optional<RGB>      fg_color;        // foreground color
        optional<RGB>      label_color;     // label color
        optional<double>   default_value;   // default value times hundred
        optional<bool>     steady_on_click; // when set, fader is steady on click,
                                            // otherwise it jumps on click
    };

    using Definition = boost::variant<Generic, Vslider>;
} // namespace Defs

using Defs::Definition;

struct PdObject {
    int        xPos, yPos;
    Definition definition;
};

Putting it all together:

Full Demo


// #define BOOST_SPIRIT_DEBUG
#include <boost/core/demangle.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/optional/optional_io.hpp>
#include <boost/spirit/include/qi.hpp>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>

namespace Ast {
    // C++ makes it hard to pretty-print containers...
    struct print_hack : std::char_traits<char> {};
    using String = std::basic_string<char, print_hack>;
    static inline std::ostream& operator<<(std::ostream& os, String const& s) { return os << quoted(s); }
    static inline std::ostream& operator<<(std::ostream& os, std::vector<String> const& ss) {
        os << "{";
        for (auto& s : ss) os << " " << s;
        return os << " }";
    }

    struct RGB {
        int32_t rgb;
    };

    namespace Defs {
        using boost::optional;

        struct Generic {
            String              name;
            std::vector<String> params;
        };

        struct Vslider {
            optional<unsigned> width;           // horizontal size of gui element
            optional<unsigned> height;          // vertical size of gui element
            optional<double>   bottom;          // minimum value
            optional<double>   top;             // maximum value
            bool               log = false;     // when set, the slider range is output
                                                // logarithmically; otherwise its output
                                                // is linear
            String           init;              // sends default value on patch load
            String           send;              // send symbol name
            String           receive;           // receive symbol name
            optional<String> label;             // label
            int              x_off = 0;         // horizontal position of the label
                                                // text relative to the upperleft
                                                // corner of the object
            int y_off = 0;                      // vertical position of the label
                                                // text relative to the upperleft
                                                // corner of the object
            optional<String>   font;            // font type
            optional<unsigned> fontsize;        // font size
            optional<RGB>      bg_color;        // background color
            optional<RGB>      fg_color;        // foreground color
            optional<RGB>      label_color;     // label color
            optional<double>   default_value;   // default value times hundred
            optional<bool>     steady_on_click; // when set, fader is steady on click,
                                                // otherwise it jumps on click
        };

        using Definition = boost::variant<Generic, Vslider>;

        using boost::fusion::operator<<;
    } // namespace Defs

    using Defs::Definition;

    struct PdObject {
        int        xPos, yPos;
        Definition definition;
    };

    using boost::fusion::operator<<;
}

BOOST_FUSION_ADAPT_STRUCT(Ast::Defs::Vslider, width, height, bottom, top, log, init, send, receive, label,
                          x_off, y_off, font, fontsize, bg_color, fg_color, label_color, default_value,
                          steady_on_click)
BOOST_FUSION_ADAPT_STRUCT(Ast::Defs::Generic, name, params)
BOOST_FUSION_ADAPT_STRUCT(Ast::RGB, rgb)
BOOST_FUSION_ADAPT_STRUCT(Ast::PdObject, xPos, yPos, definition)

namespace qi = boost::spirit::qi;

template <typename Iterator> struct PdObjectGrammar : qi::grammar<Iterator, Ast::PdObject()> {
    PdObjectGrammar() : PdObjectGrammar::base_type(start) {
        start = qi::skip(qi::blank)[ object ];

        /* #X obj [x_pos] [y_pos] [object_name] [p1] [p2] [p3] [...];\r\n
         * Parameters:
         *  [x_pos] - horizontal position within the window
         *  [y_pos] - vertical position within the window
         *  [object_name] - name of the object (optional)
         *  [p1] [p2] [p3] [...] the parameters of the object (optional)
         */
        qi::uint_parser<int32_t, 16, 6, 6> hex6{};
        rgb = ('#' >> hex6) | qi::int_;

        auto boundary = qi::copy(!(qi::graph - ';'));
        auto opt = [](auto const& p) { return qi::copy(p | &qi::lit(';')); };

        bool_ = qi::bool_ | qi::uint_parser<bool, 2, 1, 1>{};

        vslider                             //
            = qi::lexeme["vsl" >> boundary] //
            >> opt(qi::uint_)               // width
            >> opt(qi::uint_)               // height
            >> opt(qi::double_)             // bottom
            >> opt(qi::double_)             // top
            >> opt(bool_)                   // log
            >> opt(bool_)                   // init
            >> opt(string)                  // send
            >> opt(string)                  // receive
            >> opt(string)                  // label
            >> opt(qi::int_)                // x_off
            >> opt(qi::int_)                // y_off
            >> opt(string)                  // font
            >> opt(qi::uint_)               // fontsize
            >> opt(rgb)                     // bg_color
            >> opt(rgb)                     // fg_color
            >> opt(rgb)                     // label_color
            >> opt(qi::double_)             // default_value
            >> opt(bool_)                   // steady_on_click
            ;

        generic           //
            = opt(string) // name
            >> *string;   // params

        definition = vslider | generic;

        string = +('\\' >> qi::space | qi::graph - ";");
        object = "#X obj"           //
            >> qi::int_ >> qi::int_ //
            >> definition           //
            >> ";"                  //
            ;

        BOOST_SPIRIT_DEBUG_NODES(          //
            (start)(object)(string)(rgb)   //
            (definition)(vslider)(generic) //
            (bool_))                       //
    }

  private:
    using Skipper = qi::blank_type;
    qi::rule<Iterator, Ast::PdObject(),         Skipper> object;
    qi::rule<Iterator, Ast::Defs::Vslider(),    Skipper> vslider;
    qi::rule<Iterator, Ast::Defs::Generic(),    Skipper> generic;
    qi::rule<Iterator, Ast::Defs::Definition(), Skipper> definition;

    // lexemes
    qi::rule<Iterator, bool()>          bool_;
    qi::rule<Iterator, Ast::RGB()>      rgb;
    qi::rule<Iterator, Ast::String()>   string;
    qi::rule<Iterator, Ast::PdObject()> start;
};

int main()
{
    PdObjectGrammar<std::string::const_iterator> const parser;

    for (std::string const input :
         {
             "#X obj 55 50;",
             "#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;",
             "#X obj 50 38 vsl 15 128 0 127 0 0 empty empty empty 0 -8 0 8 -262144 -1 -1 0 1;",
         }) //
    {
        Ast::PdObject msg;

        auto f = input.begin(), l = input.end();
        std::cout << "Input: " << quoted(input) << std::endl;
        if (qi::parse(f, l, parser, msg)) {
            std::cout << " -> " << boost::core::demangle(msg.definition.type().name()) << std::endl;
            std::cout << " -> " << msg << std::endl;
        } else
            std::cout << " -> FAILED" << std::endl;

        if (f != l)
            std::cout << " Remaining: " << quoted(std::string(f, l)) << std::endl;
    }
}

Prints

Input: "#X obj 55 50;"
 -> Ast::Defs::Generic
 -> (55 50 ("" { }))
Input: "#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;"
 -> Ast::Defs::Generic
 -> (92 146 ("bng" { "20" "250" "50" "0" "empty" "empty" "empty" "0" "-10" "0" "12" "#fcfcfc" "#000000" "#000000" }))
Input: "#X obj 50 38 vsl 15 128 0 127 0 0 empty empty empty 0 -8 0 8 -262144 -1 -1 0 1;"
 -> Ast::Defs::Vslider
 -> (50 38 ( 15  128  0  127 0 "" "empty" "empty"  "empty" 0 -8  "0"  8  (-262144)  (-1)  (-1)  0  1))

Note how we parse bng as Generic by default, simply because we didn't add a definition rule for it yet. After adding one, the output becomes:

Input: "#X obj 55 50;"
 -> Ast::Defs::Generic
 -> (55 50 ("" { }))
Input: "#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;"
 -> Ast::Defs::Bang
 -> (92 146 ( 20  250  2 "" "empty" "empty" "empty"  0  -10  "0"  12  (16579836)  (0)  (0)))
Input: "#X obj 50 38 vsl 15 128 0 127 0 0 empty empty empty 0 -8 0 8 -262144 -1 -1 0 1;"
 -> Ast::Defs::Vslider
 -> (50 38 ( 15  128  0  127 0 "" "empty" "empty"  "empty" 0 -8  "0"  8  (-262144)  (-1)  (-1)  0  1))

That was basically 1:1 copy-paste from the PureData grammar docs.

Of course, my fingers itch to remove the duplication of init, send, receive, label, x_off, y_off, font, fontsize, bg_color, fg_color and label_color... But I'll leave it as an exorcism for the reader.

sehe
  • If I understand your post right, the "boundary" rule you put in your full demo prevents things like "vsla" getting matched by the "vsl" rule. Is that correct? I arrived at a similar implementation to your second example and found that it would parse "tgle" as though it were the "tgl" keyword, which is incorrect. Is the boundary rule how you're preventing that? – stix Apr 12 '23 at 18:39