C++ Double Dispatch, Factory Pattern or a way to automate creating derived objects from received serial data

Question

I'm working on a C++ communication library in which I receive serialized data from a number of devices (network sockets, uart/usb, CAN, and LIN networks). I also need to create serialized data from my message objects.

I have a base class called MessageBase from which I presently have two derived classes called Message and CtrlMessage. The project will eventually need a few more message types in the future and so I'm looking to implement using a design pattern that allows easy to expand to new message types in the future.

My other goal is, as Scott Meyes puts it, hard to use the classes incorrectly and easy to use correctly.

I began looking at using NVI pattern and using C++ factory to create messages however, the Factory class would then need to handle some of the de-serialization of a header in order to figure out what Type of message is in the payload.

class MessageBase
{
private:
  // other public & private methods omitted for brevity
  MessageBase &ISerialize( dsStream<byte> &sdata) = 0;
public:
  MessageBase &Serialize( dsStream<byte> &sdata)
  {
     ISerialize(sdata);
  }

}

class Message : public MessageBase
{
private:
    // other public & private methods omitted for brevity
   MessageBase &ISerialize( dsStream<byte> &sdata);
public:
}

class MessageFactory
{
private:

public:

   CreateMessageFromStream( dsStream<byte> &RxData)
   {
      // read N bytes from RxData to determine type and then
      // jump into switch to build message

      switch(MsgType)
      {
         case MSG_DATA:
         {
           Message *pMsg = new Message(RxData);
         }
         break;

         case MSG_CTRL:
         {
           MessageCtrl *pMsg = new MessageCtrl(RxData);
         }
         break;
      }
   }

// I shorten quite a bit of this to, hopefully, give the basic idea.

The other approach I've been studying is the Double Dispatch as outlined by Scott Meyers Item#33 in his More Effective C++ book. But that only appears to shift the problem to either requiring all the sibling derived classes to know about each other or the more advanced solution with stl map to emulate vtable. That code looks awful and hard to follow.

I took a look at the C++ visitor pattern and the Builder Creational pattern and these all require the caller to know what kind of derived Message type you want to instantiate ahead of time.

I know I could just put a big switch statement in the MessageFactory, like shown, and be done with it but what I'm after is a way to add new message types derived from MessageBase and not have to touch the MessageFactory class. I don't want other programmers to have to know or go find all the places where code needs to be updated for a new message type.

Also, this is an embedded application and that being the case certain things are off the table. I can use Template programming techniques but I do not have any STL library nor do I have any Boost library support.

Any suggestions?

How many message types are we talking about? Are all the message types small integers? How likely is it a message type will be added 5 years down the road? — brian beuning, Feb 02 '14 at 04:16
Crtp intermediate helper that registers each class with a central factory? Or, type list of derived centrally managed that factory uses? Or each message has an id, and there is a factory function overload set taking the distinct types generated from the id where each type provides an overload, and "magic switch" turns the run time into a compile time value? Enum of message types with a terminating '`num_msgs`' and a global array of factories from `enum` to factory? — Yakk - Adam Nevraumont, Feb 02 '14 at 04:17
You want a map from MsgType to message creation function. You **must** populate this map "by hand" (that is, register every serialisable class) when the program starts. There is no way to do it automatically. If you don't have std::map, create a simple one yourself — n. m. could be an AI, Feb 02 '14 at 04:24
@brian Beuning: Currently just 2 types but I'm aware that 2 more will be needed in the next year. In next 5 years, I would say it's reasonable to need 3 more types. To expand: This is an embedded computer that goes in a vehicle and various automotive networks and message types will flow into the hardware. I'm trying to encapsulate this messaging. — Eric, Feb 02 '14 at 05:16
@Yakk: each message type is centrally added to a ENUM. I don't quite follow the rest of your suggestion. Perhaps some sample code? — Eric, Feb 02 '14 at 05:17
@n.m. I have a map from MsgType to message creation function however it requires developers creating new message types to know to go update the map table. Maybe you're referring to the Scott Meyers technique I mentioned in my OP; That code is hard to follow IMHO. Currently my "map" technique is a switch statement in the factory class but I'm trying to find a more elegant way like the double dispatch but without having siblings know about each other.... — Eric, Feb 02 '14 at 05:20
No, double dispatch will not help you here. It works when you need to dispatch on two polymorphic objects at once, e.g. intersect two shapes. You don't have even one object, your goal is to create one. — n. m. could be an AI, Feb 02 '14 at 05:49
You could generate the switch using a variadic template function, but this would still require new message types to be added to the template specialization. I do not see a way out of forcing new message types to be registered somewhere at compile time. This means some code somewhere must be changed to make the factory aware of the new type. — hifier, Feb 02 '14 at 08:27
@N.M. Thank you for clarifying the double dispatch pattern. Your example of two shapes intersecting really help to clarify, thank you! — Eric, Feb 02 '14 at 19:10
@hifier I was sort thinking the same as you but I figured coming here and asking might reveal some elegant tricks from folks who are more knowledgeable or experienced than I am and might offer a better way inline with the make it hard for others to use wrong and easy to use correct mantra. I guess, really, I'm trying to validate my approach with others, like yourself, in this field. — Eric, Feb 02 '14 at 19:11
@Eric - I didn't mean to imply that it's a bad question. I think this discussion is useful. — hifier, Feb 04 '14 at 05:13

score 1 · Accepted Answer · answered Feb 03 '14 at 20:12

I do not know if this is worth it. But machinery to do some of what you want can be written.

We start with MessageBase. It has a private constructor.

You then tell it to make MessageHelper<T> to be a friend class.

MessageHelper looks like this:

enum MessageType {
  TYPE1, // notice no assigment
  TYPE2, // values should be consecutive, distinct, and start at `0`
  TYPE3, // or things go poorly later on.
  NUM_TYPES /* should be last */
};
template<MessageType> struct MessageTag {}; // empty, for overloading
template<MessageType...> struct MessageTags {};
template<MessageType Last, MessageType... List> struct MakeMessageTags:
  MakeMessageTags<MessageType(Last-1), MessageType(Last-1), List...>
{};
template<MessageType... List> struct MakeMessageTags<MessageType(0), List...>:
  MessageTags<List...>
{};
typedef MessageBase*(*MessageCreatorFunc)(dsStream<byte>&);

// write this somewhere, next to a given type.  If you don't, code later will fail to compile
// (yay).  You could make a macro to write these:
MessageCreatorFunc MessageCreator( MessageTag<TYPE1> ) { 
  return []( dsStream<byte>& st )->MessageBase* {
    return new MessageType1(st);
  };
}

// manual compile time switch:    
template<MessageType... List>
MessageBase* CreateMessageFromStream_helper( MessageType idx, dsStream<byte>& st, MessageTags<List...> )
{
  static MessageCreatorFunc creator[] = { MessageCreator(MessageTag<List>())... };
  return creator[idx]( st );
}
MessageBase* CreateMessageFromStream( dsStream<byte>& st ) {
  // stuff, extract MessageType type
  MessageBase* msg = CreateMessageFromStream_helper( type, st, MakeMessageTags<MessageType::NUM_TYPES>() );
  // continue
}

the effect of the above code is that we automatically build a manual jump table to create our messages.

If nobody writes the MessageCreator( MessageTag<TYPE> ) overload, or it is not visible in the context of the _helper, the above fails to compile. So this ensures that if you add a new message type, you write the creation code or you break the build. Much better than a switch statement hiding somewhere.

At some spot there must be an association between MessageType and the C++ type that should be created: the above machinery just makes sure if that association is not set up, we get a compiler error.

You can have a bit more fun and get a better message by, instead of overloading on MessageCreator, you specialize:

template<MessageType TYPE>
void MessageCreator( MessageTag<TYPE> ) {
  static_assert( "You have failed to create a MessageCreator for a type" );
}
// specialization:
template<>
MessageCreatorFunc MessageCreator( MessageTag<TYPE1> ) {
  return []( dsStream<byte>& st )->MessageBase* {
    return new MessageType1(st);
  };
}

which is a bit more obtuse, but might generate a better error message. (while template<> is not required for all cases, as the override also replaces the template, by the standard at least one such specialization which can compile must exist, or the program is ill formed, no diagnosis required (!?)).

This is along the lines of what I was thinking, but after looking at the result it makes me wonder if it is really worth it. I mean there are a million and one ways to break a program, if you're adding a message type you might want to write some tests that ensure it is serialized/deserialized properly. This compile time check does not add much value and greatly obfuscates the logic IMO. There is something to be said for the simplicity of a switch statement. Also, this kind of thing is FAR easier in languages like Ruby and Javascript - not that they are viable for embedded systems, of course. — hifier, Feb 04 '14 at 05:12
@hifier I completely agree. There is something to be said for the simplicity of a switch statement. Marking Yakk's answer as accepted since s/he basically exemplified the problem and provided, IMHO, a good example of what not to do. Anyone trying to maintain the code after me will never, easily, follow what's going on here IMHO. — Eric, Feb 09 '14 at 20:38
@Eric Note that, when `enum` reflection arrives, things get less obtuse than the above: lots of restrictions go away (like consecutive, and having that `NUM_TYPES` value), and some boilerplate evaporates. I have used variations of the above in a few spots where I'm building enumerated dispatch tables (say, for pixel ops), where the count is in the enum, and other code asserts that the (C-style array) table is fully populated. I wrote the checks originally as a lark, but as it happens I actually found a missing entry in one of the tables that nobody had noticed! — Yakk - Adam Nevraumont, Feb 10 '14 at 01:47

C++ Double Dispatch, Factory Pattern or a way to automate creating derived objects from received serial data

1 Answers1