
I have a typical reentrant C-style parser, where the parsed data is held in a union like the following one:

%union {
    int number;
    const char *string;
    Item *item_ptr;
}

I would like to use Shared Pointers instead of normal pointers.

I cannot use std::shared_ptr because I cannot compile the source code with C++11, and I am also forbidden to use boost::shared_ptr. Thus, I have my own class SharedPtr, implementing the desired behaviour.

Unfortunately, I cannot plug the SharedPtr class within the union as follows:

%union {
    int number;
    const char *string;
    SharedPtr<Item> item_ptr;
}

because I get the following error:

bisonparser.yy:92:20: error: member ‘SharedPtr<Item> YYSTYPE::item_ptr’ with constructor not allowed in union
bisonparser.yy:92:20: error: member ‘SharedPtr<Item> YYSTYPE::item_ptr’ with destructor not allowed in union
bisonparser.yy:92:20: error: member ‘SharedPtr<Item> YYSTYPE::item_ptr’ with copy assignment operator not allowed in union
bisonparser.yy:92:20: note: unrestricted unions only available with -std=c++11 or -std=gnu++11

An alternative could be inserting a level of indirection as follows:

%union {
    int number;
    const char *string;
    SharedPtr<Item> *item_ptr;
}

However, I wonder if there is a much cleaner way to design my project so that I can use my SharedPtr class directly instead of through a pointer. What are the minimal changes that I have to make to get there?

Anon
  • Before C++11: "Unions cannot contain a non-static data member with a non-trivial special member function (copy constructor, copy-assignment operator, or destructor)." https://en.cppreference.com/w/cpp/language/union – JMAA Sep 15 '19 at 12:39
  • More broadly, it might be advisable to just avoid unions altogether here and use a different design. I assume you don't really need the space savings from a union? – JMAA Sep 15 '19 at 12:42
  • flex apparently has a `--c++` option that claims to generate a "C++ scanner class". Try it, see how that declares the token union, and go from there. – Sam Varshavchik Sep 15 '19 at 12:42
  • What is the `%union` notation that you are using? What is the purpose of the `%`? – Richard Chambers Sep 15 '19 at 12:47
  • @RichardChambers https://www.gnu.org/software/bison/manual/html_node/Union-Decl.html – Ted Lyngmo Sep 15 '19 at 12:48
  • What if the only thing in the `%union` is a pointer to a variant class and the variant class handles all the various data types? Looking at the doc supplied by @TedLyngmo it appears from a cursory reading that the Bison parser uses a `union` as a fundamental datatype container for its processing. I'm not sure that anything other than a `union` can be used and without C++11, constructors and destructors are not allowed in a `union`. How is this `%union` data structure used in the Bison generated parser? – Richard Chambers Sep 15 '19 at 13:02
  • @JMAA correct, I wouldn't mind using a struct or a class. I am trying to learn the minimal set of changes that I have to apply to get there. – Anon Sep 15 '19 at 13:02
  • @RichardChambers That's also an option, but I'd have to *new/delete* the intermediate node for each sub-tree, which may be a waste of resources. – Anon Sep 15 '19 at 13:06
  • I'm a bit confused. If you are doing a shared pointer class then I assume you already have to deal with `new` and `delete` as a part of managing the shared pointer contents. – Richard Chambers Sep 15 '19 at 13:09
  • @RichardChambers Ideally, I'd like to replace the union with something on the stack/heap managed by flex rather than something on the heap managed by me. If I can't get there, then using `SharedPtr *item_ptr` or wrapping `SharedPtr` in a class as you suggest is fine too. – Anon Sep 15 '19 at 13:15
  • What I am suggesting is to wrap the `union`, which is a rough kind of variant memory area without any container intelligence, into a variant class that does have container intelligence and place a pointer to such a class into the `%union`. Then all operations would go through the variant class pointer. So what is currently in the `%union` is put into the new variant class instead. However I do not know how the `%union` is used within Bison. It appears to be used to specify a `union` in the source generated by Bison. – Richard Chambers Sep 15 '19 at 13:46
  • @RichardChambers **i.e.** `%union { Node * ptr };`, where `class Node { public: SharedPtr item_ptr };`. Unfortunately, it gets a bit clunky when using it. Instead of `$$ = new Item()` I get `$$ = new Node(); $$->item_ptr.reset(new Item())`. And I have to manually delete `Node` everywhere, with chances of memory leaks much like before. That is why I am not quite happy with the design. – Anon Sep 15 '19 at 13:57

1 Answer


The basic problem is that bison's C interface uses unions heavily (from %union) and C unions are pretty incompatible with C++ (pre-C++11 you can't use them at all with non-trivial types and even post C++11 they are hard to use safely).

One possibility is to use Bison's C++ mode, but that is a fairly verbose and wide-ranging change. Alternatively, you can (carefully) use raw pointers and other types that are safe to put in a union. You'll need to be very careful to avoid memory leaks, however (and use bison's %destructor to avoid leaks on syntax errors).
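To make the raw-pointer route concrete, here is a rough C++03 sketch of what the generated union and its cleanup amount to. `SemanticValue`, `discard_item` and the `live` counter are names made up for illustration; in a real grammar the cleanup would be written as something like `%destructor { delete $$; } <item_ptr>` and bison would call it when it discards a value during error recovery.

```cpp
#include <cassert>

// Hypothetical Item with a live-object counter so leaks are visible.
struct Item {
    static int live;
    Item() { ++live; }
    ~Item() { --live; }
};
int Item::live = 0;

// Stand-in for the union bison generates from %union. Every member is
// trivially copyable, so this is a legal union before C++11.
union SemanticValue {
    int number;
    const char *string;
    Item *item_ptr;
};

// Hypothetical helper playing the role of a %destructor for <item_ptr>:
// the union never deletes anything itself, so someone has to.
void discard_item(SemanticValue &v) {
    delete v.item_ptr;
    v.item_ptr = 0;
}

// Simulates an action doing `$$ = new Item()` followed by the parser
// discarding that value after a syntax error.
int live_after_simulated_error() {
    SemanticValue v;
    v.item_ptr = new Item();  // what `$$ = new Item()` does
    discard_item(v);          // what the %destructor would do
    return Item::live;        // 0 means nothing leaked
}
```

Without the `discard_item` call the counter would stay at 1, which is the kind of leak the %destructor is meant to prevent.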

Another possibility is to not use %union at all -- instead use #define YYSTYPE SharedPtr<Item> to make the stack value a single shared pointer that you'll use everywhere in the code. You need to have your Item type be a base class with all your other types deriving from it, using virtual functions as appropriate.
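As a rough illustration of that last approach, the sketch below uses a minimal, non-thread-safe `SharedPtr` as a stand-in for your own class (its interface is an assumption), plus two hypothetical node types, `Number` and `Sum`. A `std::vector` plays the part of the parser's value stack purely for demonstration; it does not model how bison actually allocates or copies its stack.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumed stand-in for the asker's SharedPtr: a minimal reference-counted
// pointer that compiles as C++03. Not thread-safe.
template <typename T>
class SharedPtr {
public:
    explicit SharedPtr(T *p = 0) : ptr_(p), count_(p ? new long(1) : 0) {}
    SharedPtr(const SharedPtr &o) : ptr_(o.ptr_), count_(o.count_) {
        if (count_) ++*count_;
    }
    SharedPtr &operator=(SharedPtr o) { swap(o); return *this; }
    ~SharedPtr() {
        if (count_ && --*count_ == 0) { delete ptr_; delete count_; }
    }
    void swap(SharedPtr &o) {
        std::swap(ptr_, o.ptr_);
        std::swap(count_, o.count_);
    }
    T *operator->() const { return ptr_; }
private:
    T *ptr_;
    long *count_;
};

// Common base class: every semantic value is some kind of Item.
struct Item {
    static int live;  // counts live Items so leaks are visible
    Item() { ++live; }
    virtual ~Item() { --live; }
    virtual int value() const = 0;
};
int Item::live = 0;

// Hypothetical leaf node replacing the `int number` union member.
struct Number : Item {
    explicit Number(int n) : n_(n) {}
    int value() const { return n_; }
    int n_;
};

// Hypothetical interior node that shares ownership of its children.
struct Sum : Item {
    Sum(const SharedPtr<Item> &l, const SharedPtr<Item> &r) : l_(l), r_(r) {}
    int value() const { return l_->value() + r_->value(); }
    SharedPtr<Item> l_, r_;
};

#define YYSTYPE SharedPtr<Item>  // as it would appear in the grammar prologue

// Simulates the actions for a rule like `expr: expr '+' expr`.
int evaluate_one_plus_two() {
    std::vector<YYSTYPE> stack;                   // stand-in value stack
    stack.push_back(YYSTYPE(new Number(1)));      // $1
    stack.push_back(YYSTYPE(new Number(2)));      // $3
    YYSTYPE result(new Sum(stack[0], stack[1]));  // $$ = new Sum($1, $3)
    stack.clear();                                // operands are popped
    stack.push_back(result);                      // $$ is pushed
    return stack.back()->value();
}
```

The actions never call `delete`: once the last `SharedPtr` to a node goes away, the node is freed. The price is that plain values such as `int number` have to be wrapped in an `Item` subclass.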

Chris Dodd
  • Would `#define YYSTYPE Container`, where `Container` is a class containing all the members previously stored in the union, including `SharedPtr`, also work? I don't know if `$$` is copied by value at any point in time, or just passed around as a pointer/by reference. – Anon Sep 15 '19 at 20:28
  • @Anon: the parser stack is an array of `YYSTYPE`, and its elements are copied pretty often. Moreover, it's allocated dynamically with `malloc` and not initialised, so the assumption is that you can copy-assign over uninitialised storage. There is an undocumented hook which lets you substitute your own allocator, so you could make a `struct` of non-trivial objects work. But there would be a lot of overhead pointlessly constructing unused members. Also, `$$` is an automatic temporary which is initialised (by copy) to `$1`; at the end of the action it is pushed onto the stack (also by copy). – rici Sep 15 '19 at 22:00
  • @rici I see. Thanks to the both of you, this Q/A was very informative for me! – Anon Sep 15 '19 at 22:21
  • @rici: Actually it depends on which version of yacc/bison you are using -- some (most recent ones?) use `new[]` rather than `malloc` to allocate the parser stack when compiled with a C++ compiler... – Chris Dodd Nov 01 '19 at 20:13
  • @ChrisDodd: Are you sure? I tried it with bison 3.4.2 and I see no evidence of an invocation of `new[]` to expand the stack. The initial stack (in both C and C++) is an automatic array of size `YYINITDEPTH`, but if you exceed that size then, afaics, a new allocation is performed with `malloc` and the existing stack is copied into the new allocation. (Or maybe not. Depending on how you define the semantic type, it's possible that bison will simply refuse to expand the stack.) – rici Nov 02 '19 at 22:18