13

We have inherited old code which we are converting to modern C++ to gain better type safety, abstraction, and other goodies. We have a number of structs with many optional members, for example:

struct Location {
    int area;
    QPoint coarse_position;
    int layer;
    QVector3D fine_position;
    QQuaternion rotation;
};

The important point is that all of the members are optional. At least one will be present in any given instance of Location, but not necessarily all. More combinations are possible than the original designer apparently found convenient to express with separate structs for each.

The structs are deserialized in this manner (pseudocode):

Location loc;
// Bitfield expressing whether each member is present in this instance
uchar flags = read_byte();
// If _area_ is present, read it from the stream, else it is filled with garbage
if (flags & area_is_present)
    loc.area = read_byte();
if (flags & coarse_position_present)
    loc.coarse_position = read_QPoint();
etc.

In the old code, these flags are stored in the struct permanently, and getter functions for each struct member test these flags at runtime to ensure the requested member is present in the given instance of Location.

We don't like this system of runtime checks. Requesting a member that isn't present is a serious logic error that we would like to find at compile time. This should be possible because whenever a Location is read, it is known which combination of member variables should be present.

At first, we thought of using std::optional:

struct Location {
    std::optional<int> area;
    std::optional<QPoint> coarse_location;
    // etc.
};

This solution modernizes the design flaw rather than fixing it.

We thought of using std::variant like this:

struct Location {
    struct Has_Area_and_Coarse {
        int area;
        QPoint coarse_location;
    };
    struct Has_Area_and_Coarse_and_Fine {
        int area;
        QPoint coarse_location;
        QVector3D fine_location;
    };
    // etc.
    std::variant<Has_Area_and_Coarse,
                 Has_Area_and_Coarse_and_Fine /*, etc.*/> data;
};

This solution makes illegal states impossible to represent, but doesn't scale well, when more than a few combinations of member variables are possible. Furthermore, we would not want to access by specifying Has_Area_and_Coarse, but by something closer to loc.fine_position.

Is there a standard solution to this problem that we haven't considered?

  • 4
    I feel like something is wrong with the design of this code. Class fields themselves can not be optional and even if all of them are wrapped somehow then all the code using such structs will be cluttered with optionality check. – user7860670 Mar 15 '18 at 06:33
  • 1
    It isn't quite clear why you perceive this as a design flaw. Can you show specific examples of problems that arise due to this design? – n. m. could be an AI Mar 15 '18 at 06:35
  • If you want to prevent requesting an uninitialized member, the easiest solution would just be to have a constructor that initializes all members. Is there a reason why one is not being used? If you want a more modern approach, tuples could also be used in conjunction with std::get to access members, which is rolled out at runtime. – Ryan Stonebraker Mar 15 '18 at 06:35
  • 1
    If you dont want to "Requesting a member that isn't present" you should define separate classes for each possible combination of fields (maybe using a class hierarchy if that makes sense), define the required operations in the base class ((pure) virtual) and implement them in the derived classes. then you don't need any runtime checks anymore. But the virtual function calls may be a bit slower than simply using optional. – Rene Mar 15 '18 at 06:36
  • 1
    this is very similar to how google protobuf v2 works, with its optional fields, and a bit map of deserialised fields, with defaults for others. – dgsomerton Mar 15 '18 at 06:37
  • Variant is probably your safest bet, but the design looks suspicious. – juanchopanza Mar 15 '18 at 06:40
  • Are you saying that any use of say location-with-area should be statically segregated from any use of location-without-area? – n. m. could be an AI Mar 15 '18 at 06:40
  • I cannot find any substantial difference between a design that uses std:::variant and that of std::optional. If you generate, perhaps automatically, all necessary variants, you get something equivalent to an std::optional design but more awkward to use. – n. m. could be an AI Mar 15 '18 at 06:48
  • @n.m. Advantages I can think of: if the variant is consumed via visitors, then if forces to consider all the types. Also, it allows to use distinct, better designed types, instead of having a kind of hybrid type. But it really depends on how the real use-case. – juanchopanza Mar 15 '18 at 06:49
  • @juanchopanza There are O(2^N) variants, not something I'd want to enforce... – n. m. could be an AI Mar 15 '18 at 07:27
  • If the structs are deserialized, how can it be possible to check it at compile time? If you want to remove redundant checks, either change the getters or write something like `__builtin_unreachable` – Passer By Mar 15 '18 at 07:35

4 Answers4

3

What about mixins?

struct QPoint {};
struct QVector3D {};
struct Area {
    int area;
};
struct CoarsePosition {
    QPoint coarse_position;
};
struct FinePosition {
    QVector3D fine_position;
};
template <class ...Bases>
struct Location : Bases... {
};

Location<Area, CoarsePosition> l1;
Location<Area, FinePosition> l2;
Yola
  • 18,496
  • 11
  • 65
  • 106
0

I'll first say that I've also occasionally wanted to have an "optionalization" of a class, where all members become optional. I'm thinking perhaps this could be possible without proper metaprogramming using code similar to Antony Polukhin's magic_get.

But be that as it may... You could have a partially-type-safe attribute map with arbitrary-typed values:

class Location {
    enum class Attribute { area, coarse_position, fine_position, layer };   
    std::unoredered_map<Attribute, std::any> attributes;
}

std::any can hold any type (something by allocating space on the stack, sometimes internally). Facing the outside the type is erased, but you can restore it with a get<T>() method. That's safe in the sense that you'll get an exception if you stored an object of one type and are trying to get() another type, but it's unsafe in that you won't get an error thrown in compile-time.

This can be adapted to the case of arbitrary attributes, beyond those you've originally planned, e.g.:

class Location {
    using AttributeCode = uint8_t;
    enum : AttributeCode { 
        area            = 12,
        coarse_position = 34,
        fine_position   = 56,
        layer           = 789
    };   
    std::unoredered_map<AttributeCode, std::any> attributes;
}

The use of the attributes could involve free functions which check for the presence of relevant attributes.

In practice, by the way, an std::vector would probably be faster to search than the std::unordered_map.

Caveat: This solution does not give you much of the type safety you desire.

einpoklum
  • 118,144
  • 57
  • 340
  • 684
0

You could have a version of the structure that makes the bitmap compile time and checks it there. I assume that for a particular piece of code, you make assumptions about what is present. In that code you can take the version with the compile time bitmap. In order to successfully convert a run-time bit-mapped version to the compile-time bit-mapped, the bit map would be validated.

#include <stdexcept>

struct foo
{
    int a;
    float b;
    char c;
};

struct rt_foo : foo
{
    unsigned valid;
};

template <unsigned valid>
struct ct_foo : foo
{
    // cannnot default construct
    ct_foo () = delete;

    // cannot copy from version withouth validity flags
    ct_foo (foo const &) = delete;
    ct_foo & operator = (foo const &) = delete;

    // copying from self is ok
    ct_foo (ct_foo const &) = default;
    ct_foo & operator = (ct_foo const &) = default;

    // converting constructor and assignement verify the flags 
    ct_foo (rt_foo const & rtf) :
        foo (check (rtf))
    {
    }

    ct_foo & operator = (rt_foo const & rtf)
    {
        *static_cast <foo*> (this) = check (rtf);

        return *this;
    }

    // using a member that is not initialize will be a compile time error at when
    // instantiated, which will occur at the time of use

    auto & get_a () { static_assert (valid & 1); return a; }
    auto & get_b () { static_assert (valid & 2); return a; }
    auto & get_c () { static_assert (valid & 3); return a; }

    // helper to validate the runtime conversion

    static foo & check (rt_foo const & rtf)
    {
        if ((valid & rtf.valid) != 0)
            throw std::logic_error ("bad programmer!");
    }
};
nate
  • 1,771
  • 12
  • 17
0

If you always know at read or construction time what fields will be present then making the validity bit a template argument and checking with static_assert would work.

#include <stdexcept>
#include <iostream>

struct stream {
    template <typename value> value read ();
    template <typename value> void read (value &);
};

template <unsigned valid>
struct foo
{
    int a;
    float b;
    char c;

    auto & get_a () { static_assert (valid & 1); return a; }
    auto & get_b () { static_assert (valid & 2); return b; }
    auto & get_c () { static_assert (valid & 4); return c; }
};

template <unsigned valid>
foo <valid> read_foo (stream & Stream)
{
    if (Stream.read <unsigned> () != valid)
        throw std::runtime_error ("unexpected input");

    foo <valid> Foo;

    if (valid & 1) Stream.read (Foo.a);
    if (valid & 2) Stream.read (Foo.b);
    if (valid & 4) Stream.read (Foo.c);
}

void do_something (stream & Stream)
{
    auto Foo = read_foo <3> (Stream);

    std::cout << Foo.get_a () << ", " << Foo.get_b () << "\n";

    // don't touch c cause it will fail here
    // Foo.get_c ();
}

This also allows for templates to deal with missing fields using if constexpr.

template <unsigned valid>
void print_foo (std::ostream & os, foo <valid> const & Foo)
{
    if constexpr (valid & 1)
        os << "a = " << Foo.get_a () << "\n";
    if constexpr (valid & 2)
        os << "b = " << Foo.get_b () << "\n";
    if constexpr (valid & 4)
        os << "c = " << Foo.get_c () << "\n";
}
nate
  • 1,771
  • 12
  • 17