What is idiomatic modern C++ for algebraic data types?

Question

Suppose, for example, you want to implement a spreadsheet Cell in C++. A cell can be either a string, a number, or perhaps empty. Ignore other cases, like it being a formula.

In Haskell, you might do something like:

data Cell = CellStr String | CellDbl Double | None

What is considered the current "best practice" for doing it in C++? Use a union in a structure with a type indicator, or something else?

One possible option is [`boost::variant`](http://www.boost.org/doc/libs/1_60_0/doc/html/variant.html). — Pixelchemist, Mar 29 '16 at 11:55
I would go with a sorted `vector> doubles;` and a sorted `vector> strings;`. For a given cell coordinate you `lower_bound` into the `doubles`, if you didn't find it you do the same for the `strings`, otherwise it is `None`. Drawing the screen should be very fast, you just iterate through the `vector`s. Calculations are a bit messy, because they depend on the type, but you can probably abstract that away. Effectively I just cheated and never combined different types into one. Anyway, the question is too broad and opinionated. — nwp, Mar 29 '16 at 12:26
@MvG Unfortunately, the highlighting code for haskell is `lang-hs` instead of `lang-haskell`. Keep this in mind the next time you want to add highlighting of Haskell code. — Bakuriu, Mar 29 '16 at 15:07
@Bakuriu: Thanks, and sorry for the mistake. I thought that the fact that it changed highlighting to something other than the C++ default was indication enough that I had it right, although it did look a bit strange. One more reason why [having a UI for this](http://meta.stackoverflow.com/q/254432/1468366) would be a good thing… — MvG, Mar 29 '16 at 15:14
Either tagged unions (possibly templated) or via a polymorphic base class: http://stackoverflow.com/a/35838980/1116364 for sketches of both approaches. — Daniel Jour, Mar 29 '16 at 22:57
I can roll out a manual variant for you if you want. Then you don't have to use variant — Czipperz, Apr 03 '16 at 19:45
Link (as question has been closed): https://gist.github.com/czipperz/ca36868273d193b48ec7edcc84051e6e — Czipperz, Apr 03 '16 at 20:22
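
For reference, the "union in a structure with a type indicator" option from the question can be sketched by hand roughly as follows. This is only an illustration, not the code behind the links above: the names `Cell`, `Kind`, `destroy` and `example` are invented here, and C++17 is assumed for `std::destroy_at`.

```cpp
#include <cassert>
#include <memory>   // std::destroy_at (C++17)
#include <new>      // placement new
#include <string>
#include <utility>

// Hand-rolled tagged union; every name in this class is illustrative.
class Cell {
public:
    enum class Kind { Empty, Number, Text };

    Cell() : kind_(Kind::Empty) {}
    explicit Cell(double d) : kind_(Kind::Number), number_(d) {}
    explicit Cell(std::string s) : kind_(Kind::Text), text_(std::move(s)) {}

    Cell(const Cell& other) : kind_(other.kind_) {
        if (kind_ == Kind::Number) number_ = other.number_;
        else if (kind_ == Kind::Text) new (&text_) std::string(other.text_);
    }

    // Taking the argument by value means the (possibly throwing) copy is made
    // before *this is touched; the steps below do not throw.
    Cell& operator=(Cell other) {
        destroy();
        kind_ = other.kind_;
        if (kind_ == Kind::Number) number_ = other.number_;
        else if (kind_ == Kind::Text) new (&text_) std::string(std::move(other.text_));
        return *this;
    }

    ~Cell() { destroy(); }

    Kind kind() const { return kind_; }
    double number() const { assert(kind_ == Kind::Number); return number_; }
    const std::string& text() const { assert(kind_ == Kind::Text); return text_; }

private:
    // Only the string alternative needs an explicit destructor call.
    void destroy() {
        if (kind_ == Kind::Text) std::destroy_at(&text_);
    }

    Kind kind_;
    union {
        double number_;
        std::string text_;
    };
};

// Usage sketch.
inline void example() {
    Cell a;
    Cell b(42.0);
    Cell c(std::string("total"));
    assert(a.kind() == Cell::Kind::Empty);
    a = c;
    assert(a.kind() == Cell::Kind::Text && a.text() == "total");
    a = b;
    assert(a.kind() == Cell::Kind::Number && a.number() == 42.0);
}
```

The amount of bookkeeping needed for a single non-trivial member is a large part of why the variant-based answers avoid doing this manually.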

score 24 · Answer 1 · answered Mar 29 '16 at 12:25

struct empty_type {};
using cell_type = boost::variant<std::string, double, empty_type>;

You would then operate on a cell with:

boost::apply_visitor(some_visitor(), cell);
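
The answer leaves `some_visitor` undefined. A minimal sketch of what such a visitor could look like follows. To stay free of external dependencies it swaps in C++17 `std::variant` and `std::visit` for `boost::variant` and `boost::apply_visitor`; the names `to_display_text`, `display` and `example` are made up for illustration. With Boost, the visitor would traditionally also derive from `boost::static_visitor<std::string>`.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct empty_type {};
using cell_type = std::variant<std::string, double, empty_type>;

// Hypothetical visitor: one call operator per alternative.
// Leaving one out is a compile-time error, which is the main attraction.
struct to_display_text {
    std::string operator()(const std::string& s) const { return s; }
    std::string operator()(double d) const { return std::to_string(d); }
    std::string operator()(empty_type) const { return ""; }
};

// Dispatches to the overload matching whatever the cell currently holds.
inline std::string display(const cell_type& cell) {
    return std::visit(to_display_text{}, cell);
}

// Usage sketch.
inline void example() {
    cell_type name = std::string("total");
    cell_type amount = 2.0;
    cell_type blank = empty_type{};
    assert(display(name) == "total");
    assert(display(amount) == "2.000000");  // std::to_string uses "%f" formatting
    assert(display(blank).empty());
}
```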

answered Mar 29 '16 at 12:25

Richard Hodges

Also note that there is a proposal for standardising [`std::variant`](http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0088r1.html) (original proposal [here](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4218.pdf)) – filipos Mar 29 '16 at 12:30
@filipos in my view the proposal is flawed since it seeks to mandate allowing a variant to be empty. I sincerely hope it's rejected in favour of one that models the boost variant more closely. – Richard Hodges Mar 29 '16 at 12:33
The latest version does not mandate that. To use an empty state, you explicitly add the type `monostate` to the type list. It is true though that a variant can become invalid (not empty) under exceptional conditions. – filipos Mar 29 '16 at 12:41
@filipos it seems to me that they're mixing concerns. optional is one concern, variant is another. If the proposer wants an optional variant, he can use `optional<variant<...>>`. A variant should never be allowed to be invalid, even after a move - it should simply contain a moved-from T. – Richard Hodges Mar 29 '16 at 12:44
"LEWG opted against introducing an explicit additional variant state, representing its invalid (and possibly empty, default constructed) state." – filipos Mar 29 '16 at 12:46
@filipos I read your blog article, and boost::variant does seem to be the best fit to the problem. Compile-time checking is good. I have resisted using Boost in the past. I think Boost will be a difficult sell for an open source project that I am interested in contributing to, though. Hmmm, something for me to think about, though. – blippy Mar 29 '16 at 13:40
@RichardHodges How would you deal with the case of assignments to the variant in the case where the copy constructor for T throws an exception? – Cort Ammon Mar 29 '16 at 14:38
@CortAmmon don't we already have the copy/swap idiom for types that can throw? – Richard Hodges Mar 29 '16 at 14:45
@RichardHodges Yes, using extra allocations or space. From my understanding of variant, much of its value is in its performance because it doesn't need to allocate new memory and doesn't waste any more space than it has to. That's what I thought separated it from a trivial-to-implement visitor pattern. – Cort Ammon Mar 29 '16 at 14:51
@CortAmmon copy/swap does not need to allocate any space. The copy is an auto variable and the data is moved/swapped. Since c++11 it's extremely efficient. The tiny cost of the redundant move is outweighed by the guarantee of logical correctness baked in at compile time... or have we learned nothing in the last 20 years? – Richard Hodges Mar 29 '16 at 15:48
@RichardHodges I think, in the last 20 years, we've learned that some problems are interesting enough to call for new idioms. Copy/swap cannot work here because variant is intended to operate like a union container, not a struct. The variant has no member of type T to swap with. The existing value of the variant must be deconstructed and a new value emplaced in the same memory space (via a copy or move constructor). Thus, by the time the exception is thrown, the old value is destroyed. – Cort Ammon Mar 29 '16 at 16:08
@RichardHodges Remember that, in principle, even a move constructor might throw, but, otherwise, you can get the strong exception guarantee via `void safe_assign(auto& y, auto x) { y = move(x); }`. – filipos Mar 29 '16 at 17:09
@CortAmmon I see the problem. how do you swap an X with a Y? This is solvable with 2 temporaries and by making the swap a 2-phase process: copy A to temp1, move B to temp2, destruct B, move-construct B with temp1, delete temp1 and temp2. If B's move-constructor throws, catch and move temp2 back to A. Still no need for a zombie state. – Richard Hodges Mar 29 '16 at 19:21
@RichardHodges and if the move of temp2 back to A throws? – Cort Ammon Mar 29 '16 at 20:07
@CortAmmon fair enough. you've got me :) – Richard Hodges Mar 29 '16 at 20:10
+1; it's worth noting that `std::variant` [has been implemented for C++17](http://en.cppreference.com/w/cpp/utility/variant). – Hi-Angel Dec 04 '16 at 09:13
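
The distinction drawn in the comments, between an explicit empty alternative (`monostate`) and the invalid state a variant can only reach through an exception, can be sketched with the C++17 `std::variant`. This is an illustration under the assumption of a C++17 compiler; `cell_type`, `is_empty` and `example` are names chosen here.

```cpp
#include <cassert>
#include <string>
#include <variant>

// std::monostate is listed first, so a default-constructed cell starts out empty.
using cell_type = std::variant<std::monostate, std::string, double>;

inline bool is_empty(const cell_type& c) {
    return std::holds_alternative<std::monostate>(c);
}

// Usage sketch.
inline void example() {
    cell_type cell;                          // holds std::monostate
    assert(is_empty(cell));
    assert(!cell.valueless_by_exception());  // "empty" is an ordinary, valid state

    cell = 3.14;                             // now holds a double
    assert(std::get<double>(cell) == 3.14);

    cell = std::string("total");             // now holds a string
    assert(std::get_if<double>(&cell) == nullptr);

    cell = std::monostate{};                 // explicitly emptied again
    assert(is_empty(cell));
}
```

`valueless_by_exception()` reports the separate invalid state discussed above, which can only arise when constructing a new alternative throws.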

score 5 · Answer 2 · edited Mar 29 '16 at 17:29

Inheritance?

I have to say that I do not really like this method and would not consider it modern, but it still seems to be standard.

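// Common base class; the virtual destructor lets derived cells be deleted through a Cell*.
class Cell {
    public:
    virtual ~Cell() = default;
};

class DoubleCell : public Cell {
    double value;

    public:
    explicit DoubleCell( double v ) : value(v) {}
    double DoubleValue() const { return value; }
    ...
};

class StringCell : public Cell {
    std::string value;

    public:
    explicit StringCell( std::string v ) : value(std::move(v)) {}
    const std::string& StringValue() const { return value; }
    ...
};

class EmptyCell : public Cell {
    ...
};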

Some of the drawbacks are:

When getting the actual value, you need to use different functions. This usually involves checking the dynamic type and casting; C++ has no instanceof, so in practice that means `dynamic_cast`.
Different objects cannot be put directly into a container, only as pointers.
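
Both drawbacks show up in a short, self-contained sketch. It assumes a `Cell` base class with a virtual destructor, which the answer does not show, and the `describe` and `example` helpers are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumed base class; it must be polymorphic for dynamic_cast to work.
class Cell {
public:
    virtual ~Cell() = default;
};

class DoubleCell : public Cell {
    double value;
public:
    explicit DoubleCell(double v) : value(v) {}
    double DoubleValue() const { return value; }
};

class StringCell : public Cell {
    std::string value;
public:
    explicit StringCell(std::string v) : value(std::move(v)) {}
    const std::string& StringValue() const { return value; }
};

class EmptyCell : public Cell {};

// Drawback 1: the caller probes the dynamic type and casts before it can
// reach the type-specific accessor.
inline std::string describe(const Cell& cell) {
    if (auto* d = dynamic_cast<const DoubleCell*>(&cell))
        return "number " + std::to_string(d->DoubleValue());
    if (auto* s = dynamic_cast<const StringCell*>(&cell))
        return "string " + s->StringValue();
    return "empty";
}

// Drawback 2: the container holds (smart) pointers, so every cell is a
// separate heap allocation.
inline void example() {
    std::vector<std::unique_ptr<Cell>> row;
    row.push_back(std::make_unique<DoubleCell>(1.5));
    row.push_back(std::make_unique<StringCell>("abc"));
    row.push_back(std::make_unique<EmptyCell>());

    assert(describe(*row[0]) == "number 1.500000");
    assert(describe(*row[1]) == "string abc");
    assert(describe(*row[2]) == "empty");
}
```

Nothing forces `describe` to handle a cell type added later; an unhandled type silently falls through to "empty", which is the compile-time check a variant visitor provides and this design does not.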

edited Mar 29 '16 at 17:29

x4u

answered Mar 29 '16 at 11:56

Frank Puffer

This only partly answers the question. How would you get a value from such a cell `??? getValue(){return value;}` – 463035818_is_not_an_ai Mar 29 '16 at 11:58
I don't think that will work, because you couldn't, for example, have an array of cells. – blippy Mar 29 '16 at 11:59
@blippy: Yes, but you can have an array of (smart) pointers to cells. – Frank Puffer Mar 29 '16 at 12:00
Pointer semantics, dynamic memory allocations and virtual function calls for every single cell doesn't seem like a good idea to me. – nwp Mar 29 '16 at 12:01
With templates you cannot have an array or vector of cells since they will be separate types. – NathanOliver Mar 29 '16 at 12:10
Templates won't really work (at least when implemented like in the example) because the type of each cell would need to be known at compile-time. I guess you can combine both approaches by making the template derived from a common base class, but it would have the same performance overhead as the first method then. – interjay Mar 29 '16 at 12:15
@NathanOliver: Right, have removed the template example. – Frank Puffer Mar 29 '16 at 12:24
@FrankPuffer I don't want to sound rude but you might want to consider just deleting the whole answer. – NathanOliver Mar 29 '16 at 12:29
@NathanOliver: Yes I will probably do so, but still it is the only answer so far that does not use additional libraries. That's what made me hesitate so far. – Frank Puffer Mar 29 '16 at 12:37
@FrankPuffer The use of the library is just so they do not have to roll their own. In your example how would you get the value from the cell? Until you get that this really is only half an answer. – NathanOliver Mar 29 '16 at 12:39
I don't think the answer should be deleted. It is a sensible design solution, and is worthy of discussion, even if it is only to say that there are better solutions. – blippy Mar 29 '16 at 12:42
Relevant article: https://akrzemi1.wordpress.com/2016/02/27/another-polymorphism/ – filipos Mar 29 '16 at 12:57
Is it legal to construct a union of base & derived class objects, and use runtime polymorphism on the base class member? You'd get a variant-style object but with the implementation-hiding of normal runtime polymorphism. Sadly I suspect the answer is "no" because having virtual functions means the types aren't standard-layout. – Useless Mar 29 '16 at 16:06
