2

I'm creating a discriminated union-like class. I am using c++ 17, so I could technically use a std::variant, but because of the specific use-case I want the meaning of each variant to be more explicit (especially because two of the cases hold no data except which case they are). The class looks something like this (for the purposes of simplicity, I'll ignore move semantics in the question):

class MyC {

  public:
  enum class Kind {A, B, C, D};

  private:
  Kind _kind;
  union {

    struct {} _noVal;
    string _aVal;
    int _bVal; 

  };

  MyC(Kind kind) : _kind(kind), _noVal() {}

  public:
  MyC(const MyC& other) : _kind(other.kind), _noVal() {
    if (_kind == Kind::A) new (&_aVal) string(other._aVal);
    if (_kind == Kind::B) _bVal = other._bVal;
  }

  ~MyC() {
    if (_kind == Kind::A) _aVal.~string();
  }

  MyC& operator =(const MyC&);

  // factory methods and methods for consuming the current value

}

My first thought for the copy assignment operator is

MyC& MyC::operator &(const MyC& other) {
  this->~MyC();
  _kind = other._kind;
  if (_kind == Kind::A) new (&_aVal) string(other.aVal);
  else if (_kind == Kind::B) _bVal = other.bVal;
  else _noVal = other.noVal;
  return *this;
}

This seems fine to me, but I'm wondering if it's better c++ style to call string's copy assignment operator, which would require something more like this:

MyC& MyC::operator &(const MyC& other) {
  if (other._kind == Kind::A) {
    if (_kind != Kind::A) new (&_aVal) string; // *
    _aVal = other.aVal;
  } else if (other._kind == Kind::B) {
    _bVal = other.bVal;
  } else {
    _noVal = other.noVal;
  }
  _kind = other._kind;
  return *this;
}

To summarize, what is the right way to do this (and why), or does it matter?


* This line is here because my original implementation set aVal directly without making sure a string had ever been initialized there, and it crashed.

Anonymous
  • 491
  • 2
  • 12
  • How about using existing tested code and storing a `std::variant` in your class in-place of the `union`? You can keep the additional behaviour/requirements just implement less code. – Richard Critten Nov 06 '19 at 20:49
  • I could, but I'm under the impression `std::variant` has more overhead, and since my class is really fairly simple it's not much of a problem unless there's some really big, complicated issue I really don't want to do fix on my own (as with `std::shared_ptr`). – Anonymous Nov 06 '19 at 20:58
  • I strongly suggest you to use the `std::variant`; the only memory overhead it has is a single `size_t` (I believe) for the type index, which is not much. Those low-level classes like `std::variant` can be very hard to get right, especially if you want to have proper exception safety. – HolyBlackCat Nov 06 '19 at 21:35

1 Answers1

0

Your first version makes me nervous because you never actually call a constructor on this after calling this->~MyC(), so your object should be considered invalid. Even if the code immediately after forces all the current member variables into theoretical validity, it'll be a prone to bugs.

Your second option seems better, but it still leaves three places that must update every time someone adds to that union... and given that your Kind enum has 4 values while you're only showing three... I'm thinking that'll happen sooner or later. Again, prone to bugs.

I can think of two other options:

1)

MyC& operator =(const MyC& other)
{
    this->~MyC();
    new (this) MyC(other);
}

Use your existing copy constructor in place, after using your destructor "in place" to avoid leaks. Slightly hacky, much cleaner, no extra maintenance.

This results in you having only two places where you need to make changes, at the cost of a little efficiency in the string to string case. I find that maintainability trumps efficiency in most cases. Which cases? Ask your profiler.

WARNING: As Chris pointed out in the comments below, This Will Fail Horribly when dealing with a class derived from MyC unless that class likewise has its own assignment operator. If it were called on some derived class, that class would

  • leak anything that required that the derived class's destructor was called.
  • turn the derived class into an instance of MyC. MOSTLY. I suspect that the derived destructor would still be called, with much potential badness to follow.
    • OTOH, this could just clean up all the stuff that 'leaked' when we called the wrong destructor earlier. It really Just Depends.

On the one hand, this code is an excellent example of walking a high rope with no net. On the other, so is working with new/delete directly. We just have more practice walking that particular rope, and we have nets like unique_ptr and shared_ptr we can (and generally should) use.

2)

Wrap your union in some any/variant template (std::variant, boost::any, whatever), per Richard's comment. My last employer had something like 3 different variant/any classes written by three different programmers, at least two of whom clearly didn't bother to look around our codebase first. Your company might have its own implementation[s] too.

Don't assume that they'll be less efficient. Again: ask your profiler.

unrelated: Searching said former employer's codebase for interesting phrases like "do not check in", profanity, etc, found all kinds of interesting things. It was a game company, so things were considerably less "retentive" than one might find elsewhere. It made for entertaining reading.

Mark Storer
  • 15,672
  • 3
  • 42
  • 80
  • OPTION 1 is anything but clean. It is a terrible antipattern that should virtually never be used. 1) If the constructor throws, you're left with a dead object. 2) Even if the constructor succeeds, it's not guaranteed that myC is the proper type to construct in its place! Consider that myC is a base class, and _this_ actually points to a derived type. This approach will, for example, replace a ```Dog``` with an ```Animal``` and anyone still thinking it's a Dog are in serious trouble. – Chris Uzdavinis Nov 06 '19 at 21:11
  • An excellent point, though to be fair the destructor isn't virtual leading me to believe that won't be a problem. The problem could then be resolved by ensuring that all child classes implement their own assignment operator, which is just prone to a different kind of bug. I'm not sure even RTTI could dig you out hole number one. – Mark Storer Nov 06 '19 at 21:17
  • 1
    Also fun if in-place copy-ctor throws. – Deduplicator Nov 06 '19 at 21:49