Elegant way to implement extensible factories in C++

Question

I am looking for an intuitive and extensible way to implement factories for subclasses of a given base class in c++. I want to provide such a factory function in a library.The tricky part is that I want said factory to work for user-defined subclasses as well (e.g. having the library's factory function build different subclasses depending on what modules are linked to it). The goal is to have minimal burden/confusion for downstream developers to use the factories.

An example of what I want to do is: given a std::istream, construct and return an object of whatever subclass matches the content, or a null pointer if no matches are found. The global factory would have a signature like:

Base* Factory(std::istream &is){ ... };

I am familiar with prototype factories, but I prefer to avoid the need to make/store prototype objects. A related question is posted here for java: Allowing maximal flexibly/extensibility using a factory.

I am not looking for c++11-specific solutions at the moment, but if they are more elegant I would be happy to learn about those.

I came up with one working solution which I believe is fairly elegant, which I will post as an answer. I can imagine this problem to be fairly common, so I am wondering if anyone knows of better approaches.

EDIT: it seems some clarification is in order...

The idea is for the factory to construct an object of a derived class, without containing the logic to decide which one. To make matters worse, the factory method will end up as part of a library and derived classes may be defined in plugins.

Derived classes must be able to decide for themselves whether or not they are fit for construction, based on the input provided (for example an input file). This decision can be implemented as a predicate that can be used by the factory, as was suggested by several people (great suggestion, by the way!).

Why do you want these classes to derive from a given base class? Why not have a template and allow any class? That is, having `Factory`. — Aleph, Jun 29 '13 at 10:09
@AnotherTest I have no requirements to allow any class for my specific problem, so I prefer to enforce some structure in the code by limiting the factory to produce subclasses. — Marc Claesen, Jun 29 '13 at 10:12
Ah, I see, but often it's more easy to generalize something all the way through. I think having `Factory` would simplify your question. You could even make the parameters for your Factory variable (using variadic templates ;)) so you have to write just one class for this factory thing. — Aleph, Jun 29 '13 at 10:15
The main issue is that I want one interface to create `Base` objects, whatever the derived type may be. Using a template would not solve anything for me, since I can't say which type needs to be produced in advance (e.g. the template parameter is unknown and no prototype is available). — Marc Claesen, Jun 29 '13 at 10:18
You could, that's a second template parameter then. You could as well always have it produce `Base*` for any given type (=template). — Aleph, Jun 29 '13 at 10:19
I think you misunderstand my question. I don't want a conversion function to `Base`. I want to produce a subclass of base, *without knowing which one*, based on input (for example an `istream`). Hence templates do not help. — Marc Claesen, Jun 29 '13 at 12:56
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/32630/discussion-between-marc-claesen-and-anothertest) — Marc Claesen, Jun 30 '13 at 09:10
Could you perhaps create a factory that would create those factories? — Bartek Banachewicz, Jul 01 '13 at 13:15
I fail to see why you have thrown in this complication of the user types HAVE to derive from your own base. You can just have a templated factory that a user can use for their own types. If they wanted to use this factory for many types, they could use a boost::variant as the type for the factory. — thecoshman, Jul 01 '13 at 13:21
@BartekBanachewicz I don't know exactly how this would work, could you please provide a short example? -- The user types must subclass my own base because they are used as base objects later on. — Marc Claesen, Jul 01 '13 at 13:24
Well, my point was pretty much that there's no such thing that has both *factory* and *elegant* words in it. — Bartek Banachewicz, Jul 01 '13 at 14:32
I believe you should have a look at "API Design for C++" by Martin Reddy, section 3.3. — Korchkidu, Jul 01 '13 at 16:04
You could have a look at my answer here: http://stackoverflow.com/questions/333400/how-to-design-a-simple-c-object-factory/537165#537165 — Emiliano, Jul 01 '13 at 16:49
Do you need somehow parse the content of istream? Is it possible to extract a part o which allow you to recognize derived type? — Arpegius, Jul 02 '13 at 10:04
@Arpegius Yes, the istream must be parsed, but using logic developed in plugins. Ideally the istream's format can remain completely unconstrained, e.g. there is no specific part that contains the information to recognize the derived type. That is why I find this a tough nut to crack. The factory itself cannot contain the logic to do this. — Marc Claesen, Jul 02 '13 at 10:16
So you need to dynamically construct a parser within plugins which is directly connected to a factories, or probably a builder. The question is, are you going to write your own parser or libraries like Boost::Xpressive are considerable? What if a input will match with two different Derived classes? — Arpegius, Jul 02 '13 at 10:31
So what would be the typical sequence? Pass the istream to a class-creator to be read in, it fails creation (due to not matching required parameters), so rewind the istream and try the next class-creator? — icabod, Jul 02 '13 at 10:34
@icabod, that's the main idea, yes. This is rather inefficient, but the functional requirement is to construct objects of - as of yet undefined - derived classes from a single function (e.g. the factory). The file format may be vastly different between subclasses and this is why I believe the logic to decide on which subclass to instantiate cannot be coded in the factory itself. — Marc Claesen, Jul 02 '13 at 10:37
@Arpegius the parsing will not be performed by us, so I can't say. In our use-case, matching multiple derived classes should not occur, so I plan to make the factory return whatever matches first. Some comments have suggested ways to prevent multiple derived classes to match given input. Those are quite interesting but overkill in our setting. — Marc Claesen, Jul 02 '13 at 10:39
Ok, I understand. I based my answer on the assumption there would be some kind of [magic](https://en.wikipedia.org/wiki/File_format#Magic_number) which could be read initially and passed to the factory, which would use a lookup table to construct the desired item. — icabod, Jul 02 '13 at 10:42
@MarcClaesen There is no good solution then. The most common idea is to read some buffer from `istream` and provide it to each registered `Parser::feed` until one is completed. You cannot prevent multiple copy of parsed parameters nor the copy of buffers, unless you try construct one parser with parts provided by plugins. — Arpegius, Jul 02 '13 at 10:57
/me wonders if a solution involving boost::spirit would be overkill. — icabod, Jul 02 '13 at 11:37

Suedocode · Accepted Answer · 2013-07-03T16:08:36.047

If I understand this correctly, we want a factory function that can select which derived class to instantiate based on constructor inputs. This is the most generic solution that I could come up with so far. You specify mapping inputs to organize factory functions, and then you can specify constructor inputs upon factory invocation. I hate to say that the code explains more than I could in words, however I think the example implementations of FactoryGen.h in Base.h and Derived.h are clear enough with the help of comments. I can provide more details if necessary.

FactoryGen.h

#pragma once

#include <map>
#include <tuple>
#include <typeinfo>

//C++11 typename aliasing, doesn't work in visual studio though...
/*
template<typename Base>
using FactoryGen<Base> = FactoryGen<Base,void>;
*/

//Assign unique ids to all classes within this map.  Better than typeid(class).hash_code() since there is no computation during run-time.
size_t __CLASS_UID = 0;

template<typename T>
inline size_t __GET_CLASS_UID(){
    static const size_t id = __CLASS_UID++;
    return id;
}

//These are the common code snippets from the factories and their specializations. 
template<typename Base>
struct FactoryGenCommon{
    typedef std::pair<void*,size_t> Factory; //A factory is a function pointer and its unique type identifier

    //Generates the function pointer type so that I don't have stupid looking typedefs everywhere
    template<typename... InArgs>
    struct FPInfo{ //stands for "Function Pointer Information"
        typedef Base* (*Type)(InArgs...);
    };

    //Check to see if a Factory is not null and matches it's signature (helps make sure a factory actually takes the specified inputs)
    template<typename... InArgs>
    static bool isValid(const Factory& factory){
        auto maker = factory.first;
        if(maker==nullptr) return false;

        //we have to check if the Factory will take those inArgs
        auto type = factory.second;
        auto intype = __GET_CLASS_UID<FPInfo<InArgs...>>();
        if(intype != type) return false;

        return true;
    }
};

//template inputs are the Base type for which the factory returns, and the Args... that will determine how the function pointers are indexed.
template<typename Base, typename... Args> 
struct FactoryGen : FactoryGenCommon<Base>{
    typedef std::tuple<Args...> Tuple;
    typedef std::map<Tuple,Factory> Map; //the Args... are keys to a map of function pointers

    inline static Map& get(){ 
        static Map factoryMap;
        return factoryMap; 
    }

    template<typename... InArgs>
    static void add(void* factory, const Args&... args){
        Tuple selTuple = std::make_tuple(args...); //selTuple means Selecting Tuple.  This Tuple is the key to the map that gives us a function pointer
        get()[selTuple] = Factory(factory,__GET_CLASS_UID<FPInfo<InArgs...>>());
    }

    template<typename... InArgs>
    static Base* make(const Args&... args, const InArgs&... inArgs){
        Factory factory = get()[std::make_tuple(args...)];
        if(!isValid<InArgs...>(factory)) return nullptr;
        return ((FPInfo<InArgs...>::Type)factory.first) (inArgs...);
    }
};

//Specialize for factories with no selection mapping
template<typename Base>
struct FactoryGen<Base,void> : FactoryGenCommon<Base>{
    inline static Factory& get(){
        static Factory factory;
        return factory; 
    }

    template<typename... InArgs>
    static void add(void* factory){
        get() = Factory(factory,__GET_CLASS_UID<FPInfo<InArgs...>>());
    }

    template<typename... InArgs>
    static Base* make(const InArgs&... inArgs){
        Factory factory = get();
        if(!isValid<InArgs...>(factory)) return nullptr;
        return ((FPInfo<InArgs...>::Type)factory.first) (inArgs...);
    }
};

//this calls the function "initialize()" function to register each class ONCE with the respective factory (even if a class tries to initialize multiple times)
//this step can probably be circumvented, but I'm not totally sure how
template <class T>
class RegisterInit {
  int& count(void) { static int x = 0; return x; } //counts the number of callers per derived
public:
  RegisterInit(void) { 
    if ((count())++ == 0) { //only initialize on the first caller of that class T
      T::initialize();
    }
  }
};

Base.h

#pragma once

#include <map>
#include <string>
#include <iostream>
#include "Procedure.h"
#include "FactoryGen.h"

class Base {
public:
    static Base* makeBase(){ return new Base; }
    static void initialize(){ FactoryGen<Base,void>::add(Base::makeBase); } //we want this to be the default mapping, specify that it takes void inputs

    virtual void speak(){ std::cout << "Base" << std::endl; }
};

RegisterInit<Base> __Base; //calls initialize for Base

Derived.h

#pragma once

#include "Base.h"

class Derived0 : public Base {
private:
    std::string speakStr;
public:
    Derived0(std::string sayThis){ speakStr=sayThis; }

    static Base* make(std::string sayThis){ return new Derived0(sayThis); }
    static void initialize(){ FactoryGen<Base,int>::add<std::string>(Derived0::make,0); } //we map to this subclass via int with 0, but specify that it takes a string input

    virtual void speak(){ std::cout << speakStr << std::endl; }
};

RegisterInit<Derived0> __d0init; //calls initialize() for Derived0

class Derived1 : public Base {
private:
    std::string speakStr;
public:
    Derived1(std::string sayThis){ speakStr=sayThis; }

    static Base* make(std::string sayThat){ return new Derived0(sayThat); }
    static void initialize(){ FactoryGen<Base,int>::add<std::string>(Derived0::make,1); } //we map to this subclass via int with 1, but specify that it takes a string input

    virtual void speak(){ std::cout << speakStr << std::endl; }
};

RegisterInit<Derived1> __d1init; //calls initialize() for Derived1

Main.cpp

#include <windows.h> //for Sleep()
#include "Base.h"
#include "Derived.h"

using namespace std;

int main(){
    Base* b = FactoryGen<Base,void>::make(); //no mapping, no inputs
    Base* d0 = FactoryGen<Base,int>::make<string>(0,"Derived0"); //int mapping, string input
    Base* d1 = FactoryGen<Base,int>::make<string>(1,"I am Derived1"); //int mapping, string input

    b->speak();
    d0->speak();
    d1->speak();

    cout << "Size of Base: " << sizeof(Base) << endl;
    cout << "Size of Derived0: " << sizeof(Derived0) << endl;

    Sleep(3000); //Windows & Visual Studio, sry
}

I think this is a pretty flexible/extensible factory library. While the code for it is not very intuitive, I think using it is fairly simple. Of course, my view is biased seeing as I'm the one that wrote it, so please let me know if it is the contrary.

EDIT : Cleaned up the FactoryGen.h file. This is probably my last update, however this has been a fun exercise.

Correct. It's not complete because the author's goal was to create a factory that would create the right object based on the right arguments. Those arguments not being the name of the class to create. — Aleph, Jul 01 '13 at 16:39
Hopefully this new edited answer aligns more with the goals of the OP. — Suedocode, Jul 02 '13 at 01:00
Looking at the OP's original solution, I'm wondering if I should extend this further and map factory outputs based on input type values for each input type rather than just unique input types. — Suedocode, Jul 02 '13 at 04:08
@Aggieboy, thanks for your answer. I think there may be a slight misinterpretation of what I am looking for. The idea is to use the same arguments to construct *all* derived classes, for example a single `istream`. The selection of which derived class to construct is based on the contents of said file. For a general solution - this selection can be quite complex. This is why I try to keep such logic out of the factory method itself (hence using fptrs). — Marc Claesen, Jul 02 '13 at 08:15
@MarcClaesen, I don't understand this quote completely: *"based on some input, create an instance of a class that accepts that input."* Are you creating an instance of a class that accepts that input type, or that input type & value? Your original solution suggests the latter, so I will try to form an answer according to that. — Suedocode, Jul 02 '13 at 14:40
@Aggieboy In this context, all derived classes accept the same input type, so yes, the value of the input is also relevant. — Marc Claesen, Jul 02 '13 at 15:42
@MarcClaesen, so I went a little overboard and allowed the mapping inputs to be different from the invocation inputs (although it is easy to make them the same type & value). You could just typedef/macro/metaprogrammed-wrapper to get the exact behavior that you want. — Suedocode, Jul 02 '13 at 16:54

Aleph · Answer 2 · 2013-07-01T16:27:00.473

My comments were probably not very clear. So here is a C++11 "solution" relying on template meta programming : (Possibly not the nicest way of doing this though)

#include <iostream>
#include <utility>


// Type list stuff: (perhaps use an existing library here)
class EmptyType {};

template<class T1, class T2 = EmptyType>
struct TypeList
{
    typedef T1 Head;
    typedef T2 Tail;
};

template<class... Etc>
struct MakeTypeList;

template <class Head>
struct MakeTypeList<Head>
{
    typedef TypeList<Head> Type;
};

template <class Head, class... Etc>
struct MakeTypeList<Head, Etc...>
{
    typedef TypeList<Head, typename MakeTypeList<Etc...>::Type > Type;
};

// Calling produce
template<class TList, class BaseType>
struct Producer;

template<class BaseType>
struct Producer<EmptyType, BaseType>
{
    template<class... Args>
    static BaseType* Produce(Args... args)
    {
        return nullptr;
    }
};

template<class Head, class Tail, class BaseType>
struct Producer<TypeList<Head, Tail>, BaseType>
{
    template<class... Args>
    static BaseType* Produce(Args... args)
    {
        BaseType* b = Head::Produce(args...);
        if(b != nullptr)
            return b;
        return Producer<Tail, BaseType>::Produce(args...);
    }
};

// Generic AbstractFactory:
template<class BaseType, class Types>
struct AbstractFactory {
    typedef Producer<Types, BaseType> ProducerType;

    template<class... Args>
    static BaseType* Produce(Args... args)
    {
        return ProducerType::Produce(args...);
    }
};

class Base {}; // Example base class you had

struct Derived0 : public Base { // Example derived class you had
    Derived0() = default;
    static Base* Produce(int value)
    {
        if(value == 0)
            return new Derived0();
        return nullptr;
    }
};

struct Derived1 : public Base { // Another example class
    Derived1() = default;
    static Base* Produce(int value)
    {
        if(value == 1)
            return new Derived1();
        return nullptr;
    }
};

int main()
{
    // This will be our abstract factory type:
    typedef AbstractFactory<Base, MakeTypeList<Derived0, Derived1>::Type> Factory;
    Base* b1 = Factory::Produce(1);
    Base* b0 = Factory::Produce(0);
    Base* b2 = Factory::Produce(2);
    // As expected b2 is nullptr
    std::cout << b0 << ", " << b1 << ", " << b2 << std::endl;
}

Advantages:

No (additional) run-time overhead as you would have with the function pointers. Works for any base type, and for any number of derived types. You still end up calling the functions of course.
Thanks to variadic templates this works with any number of arguments (giving an incorrect number of arguments will produce a compile-time error message).
Explicit registering of the produce member functions is not required.

Disadvantages:

All of your derived types must be available when you declare the Factory type. (You must know what the possible derived types are and they must be complete.)
The produce member functions for the derived types must be public.
Can make compilation slower. (As always the case when relying on template metaprogramming)

In the end, using the prototype design pattern might turn out better. I don't know since I haven't tried using my code.

I'd like to state some additional things (after further discussion on the chat):

Each factory can only return a single object. This seems strange, as the users decide whether they will take the input to create their object or not. I would for that reason suggest your factory can return a collection of objects instead.
Be careful not to overcomplicate things. You want a plugin system, but I don't think you really want factories. I would propose you simply make users register their classes (in their shared object), and that you simply pass the arguments to the classes' Produce (static) member functions. You store the objects if and only if they're not the nullptr.

Thanks for your answer! *All of your derived types must be available when you declare the Factory type.* this is unfortunately a big burden for downstream developers: *The tricky part is that I want said factories to work for user-defined subclasses as well.* I want to avoid explicitly having to make a list of all derived types, especially considering this burden would be put on users, rather than myself. I fear that this requirement is likely to be misunderstood and unintentionally abused. — Marc Claesen, Jul 01 '13 at 13:20
Well, it would work for user-defined subclasses, unless with "user-defined" you mean "through some kind of plugin system". That would complicate matters of course. Although it might still be possible. Not sure if this is what you mean with user-defined though. — Aleph, Jul 01 '13 at 13:41
that is indeed what I am referring to. I will modify the original question to make this more clear. Thanks for pointing that out! — Marc Claesen, Jul 01 '13 at 13:43
@MarcClaesen Be careful! This may cause serious problems. I'm assuming you're loading a shared object or something? Note that someone could break all other plugins by creating multiple classes that both do not return a nullptr. — Aleph, Jul 01 '13 at 13:45
Yes, that is indeed a concern I am aware of. It is still a functional requirement, though, so I'll have to deal with that somehow -- probably through extensive documentation of don'ts. — Marc Claesen, Jul 01 '13 at 13:47
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/32684/discussion-between-anothertest-and-marc-claesen) — Aleph, Jul 01 '13 at 13:49
@AnotherTest: You say there is no runtime overhead, but doesn't it call each of the derived `Produce(..)` methods in turn until one returns a non-null pointer? For a large number of derived classes, that could be quite a lot of runtime overhead. Or is this taken care of with the meta-programming? — icabod, Jul 01 '13 at 16:04
@icabod I meant no additional overhead. You still have to call the functions anyway, regardless of the approach. The main problem with the question here is that there can only be one resulting Base type, which seems to be incorrect. — Aleph, Jul 01 '13 at 16:08

icabod · Answer 3 · 2013-07-02T11:03:32.700

Update: This answer made the assumption that some kind of magic existed that could be read and passed to the factory, but that's apparently not the case. I'm leaving the answer here because a) I may update it, and b) I like it anyway.

Not hugely different from your own answer, not using C++11 techniques (I've not had a chance to update it yet, or have it return a smart pointer, etc), and not entirely my own work, but this is the factory class I use. Importantly (IMHO) it doesn't call each possible class's methods to find the one that matches - it does this via the map.

#include <map>
// extraneous code has been removed, such as empty constructors, ...
template <typename _Key, typename _Base, typename _Pred = std::less<_Key> >
class Factory {
public:
    typedef _Base* (*CreatorFunction) (void);
    typedef std::map<_Key, CreatorFunction, _Pred> _mapFactory;

    // called statically by all classes that can be created
    static _Key Register(_Key idKey, CreatorFunction classCreator) {
        get_mapFactory()->insert(std::pair<_Key, CreatorFunction>(idKey, classCreator));
        return idKey;
    }

    // Tries to create instance based on the key
    static _Base* Create(_Key idKey) {
        _mapFactory::iterator it = get_mapFactory()->find(idKey);
        if (it != get_mapFactory()->end()) {
            if (it->second) {
                return it->second();
            }
        }
        return 0;
    }

protected:
    static _mapFactory * get_mapFactory() {
        static _mapFactory m_sMapFactory;
        return &m_sMapFactory;
    }
};

To use this you just declare the base-type, and for each class you register it as a static. Note that when you register, the key is returned, so I tend to add this as a member of the class, but it's not necessary, just neat :) ...

// shape.h
// extraneous code has been removed, such as empty constructors, ...
// we also don't technically need the id() method, but it could be handy
// if at a later point you wish to query the type.
class Shape {
public:
    virtual std::string id() const = 0;
};
typedef Factory<std::string, Shape> TShapeFactory;

Now we can create a new derived class, and register it as creatable by TShapeFactory...

// cube.h
// extraneous code has been removed, such as empty constructors, ...
class Cube : public Shape {
protected:
    static const std::string _id;
public:
    static Shape* Create() {return new Cube;}
    virtual std::string id() const {return _id;};
};

// cube.cpp
const std::string Cube::_id = TShapeFactory::Register("cube", Cube::Create);

Then we can create a new item based on, in this case, a string:

Shape* a_cube = TShapeFactory::Create("cube");
Shape* a_triangle = TShapeFactory::Create("triangle");
// a_triangle is a null pointer, as we've not registered a "triangle"

The advantage of this method is that if you create a new derived, factory-generatable class, you don't need to change any other code, providing you can see the factory class and derive from the base:

// sphere.h
// extraneous code has been removed, such as empty constructors, ...
class Sphere : public Shape {
protected:
    static const std::string _id;
public:
    static Shape* Create() {return new Sphere;}
    virtual std::string id() const {return _id;};
};

// sphere.cpp
const std::string Sphere::_id = TShapeFactory::Register("sphere", Sphere::Create);

Possible improvements that I'll leave to the reader include adding things like: typedef _Base base_class to Factory, so that when you've declared your custom factory, you can make your classes derive from TShapeFactory::base_class, and so on. The Factory should probably also check if a key already exists, but again... it's left as an exercise.

Problem with this approach seems to be that you don't pass any parameters to the create functions. I think the author's intention is that all classes get a chance at trying to process the input. That is, the derived classes check whether the input is meant for them or not. — Aleph, Jul 01 '13 at 16:13
@AnotherTest: Ah, I think I understand. I interpreted the question as meaning "based on some input, create the required class", but you're saying the question is more along the lines of "based on some input, create an instance of a class that accepts that input"? I guess that's the "extensible" part of the question... off to wikipedia go I. — icabod, Jul 01 '13 at 16:20
no you got it right. But the code you wrote here doesn't accept any input. Thus their creation can't depend on it. — Aleph, Jul 01 '13 at 16:23
@icabot Your `Cube/Sphere::Create` functions don't take any arguments. — Aleph, Jul 01 '13 at 16:35
@icabod *"based on some input, create an instance of a class that accepts that input"*, Yes, this is exactly what I am trying to accomplish. Thanks for your ideas so far, the main concept is clear and applicable. — Marc Claesen, Jul 02 '13 at 08:11

score 3 · Answer 4 · answered Jun 29 '13 at 09:57

3

The best solution I can currently think of is by using a Factory class which stores pointers to producing functions for each derived class. When a new derived class is made, a function pointer to a producing method can be stored in the factory.

Here is some code to illustrate my approach:

#include <iostream>
#include <vector>

class Base{};

// Factory class to produce Base* objects from an int (for simplicity).
// The class uses a list of registered function pointers, which attempt
// to produce a derived class based on the given int.
class Factory{
public:
    typedef Base*(*ReadFunPtr)(int);
private:
    static vector<ReadFunPtr> registeredFuns;
public:
    static void registerPtr(ReadFunPtr ptr){ registeredFuns.push_back(ptr); }
    static Base* Produce(int value){
        Base *ptr=NULL;
        for(vector<ReadFunPtr>::const_iterator I=registeredFuns.begin(),E=registeredFuns.end();I!=E;++I){
            ptr=(*I)(value);
            if(ptr!=NULL){
                return ptr;
            }
        }
        return NULL;
    }
};
// initialize vector of funptrs
std::vector<Factory::ReadFunPtr> Factory::registeredFuns=std::vector<Factory::ReadFunPtr>();

// An example Derived class, which can be produced from an int=0. 
// The producing method is static to avoid the need for prototype objects.
class Derived : public Base{
    private:
        static Base* ProduceDerivedFromInt(int value){ 
            if(value==0) return new Derived();
            return NULL;
        }
public:
    Derived(){};

    // registrar is a friend because we made the producing function private
    // this is not necessary, may be desirable (e.g. encapsulation)
    friend class DerivedRegistrar;
};

// Register Derived in the Factory so it will attempt to construct objects.
// This is done by adding the function pointer Derived::ProduceDerivedFromInt
// in the Factory's list of registered functions.
struct DerivedRegistrar{ 
    DerivedRegistrar(){ 
        Factory::registerPtr(&(Derived::ProduceDerivedFromInt));
    }
} derivedregistrar;

int main(){
    // attempt to produce a Derived object from 1: should fail
    Base* test=Factory::Produce(1);
    std::cout << test << std::endl; // outputs 0

    // attempt to produce a Derived object from 0: works
    test=Factory::Produce(0);
    std::cout << test << std::endl; // outputs an address
}

TL;DR: in this approach, downstream developers need to implement the producing function of a derived class as a static member function (or a non-member function) and register it in the factory using a simple struct.

This seems simple enough and does not require any prototype objects.

answered Jun 29 '13 at 09:57

Marc Claesen

16,778
6
27
62

Are you suggest this is the answer or simply saying this is what you currently have but want something better? if the latter, please add this to the question itself, and delete this non-answer. – thecoshman Jul 01 '13 at 13:19
As stated literally in the question, this is what I currently use to solve the problem but I am looking for better solutions. It does solve the problem, but is not as clean as I think other solutions could be. – Marc Claesen Jul 01 '13 at 13:21
This solution is OK to me. What you may do to improve it is along these lines: Pull out the integer and put it at registration time, and let the factory select the appropriate factory method. In C+11, make your functions return a `unique_ptr`. – Laurent LA RIZZA Jul 01 '13 at 13:30
2

It seems to me that you wrote a lot of code to mimic an inefficient `std::map>` – rectummelancolique Jul 01 '13 at 13:34
@rectummelancolique The int was just a simple example. In practice it's going to be used to read from streams and so on, in which cases deciding which subclass to instantiate is far from trivial. – Marc Claesen Jul 01 '13 at 13:41
So more like `std::set, std::function>` then? Or in a more condensed and readable way: `set>`. This would even allow your factory to determine wether your user registers a `Predicate` that clashes with one that is already in there. – rectummelancolique Jul 01 '13 at 13:55
`set` may not be the right type in my previous comment. Basically my point is thus: it seems odd to have your predicate and the construction step embedded in the same function, fundamentally what you need is a mapping between a predicate and constructor which will be called when the predicate matches. So why not expressing it exactly like that? – rectummelancolique Jul 01 '13 at 14:04
@rectummelancolique you are right, I will change the code in this answer later. Thanks a lot for the comments! – Marc Claesen Jul 01 '13 at 14:09
@MarcClaesen as stated literally in the comment, as this is not a solution but what you have already tried it should be in the question. – thecoshman Jul 01 '13 at 15:13
1

@thecoshman: I disagree - there is nothing wrong with asking a question and proposing your own answer. This is a valid answer, the question is asking for other possible solutions. – icabod Jul 02 '13 at 09:54
@icabod yes, but this is not a typical Q and A example, this answer should be part of the question, as it would impact how others answer. – thecoshman Jul 02 '13 at 13:07
@MarcClaesen, I'm not sure if what you want is really want you want. Say in FileA we have two registered class `Foo1(int)` and `Foo2(int)`, and they both have int-ctor. If we pass in an int-parameter, which class should we return? Looking at your code, we return the first match. Assume `Foo1` matches first. If we wanted `Foo2` then there would be no way to get it. Passing only parameters and trying to match a class, sounds like a big future bug, which would cause for a redesign when problems start occurring. Is the said functionality really what you want? Maybe I totally misinterpreted? – dchhetri Jul 02 '13 at 18:22
@user814628 the `int` version was just a simple example, in practice it is to be used with `istream`s and the like. *All* derived classes will be constructable from an `istream`, but which one to construct depends entirely on *the content of the stream* -- there should always be only one possible match for a given stream. – Marc Claesen Jul 02 '13 at 20:49
@MarcClaesen Oh so the arguments in the stream are unique in a sense that the list of arguments in the stream is a one to one mapping to a constructor? Do you have something to handle when there is a conflict? – dchhetri Jul 02 '13 at 21:08

Tom Kerr · Answer 5 · 2013-07-03T04:53:05.347

Here is a sustainable idiom for managing factories that resolve at runtime. I've used this in the past to support fairly sophisticated behavior. I favor simplicity and maintainability without giving up much in the way of functionality.

TLDR:

Avoid static initialization in general
Avoid "auto-loading" techniques like the plague
Communicate ownership of objects AND factories
Separate usage and factory management concerns

Using Runtime Factories

Here is the base interface that users of this factory system will interact with. They shouldn't need to worry about the details of the factory.

class BaseObject {
public:
    virtual ~BaseObject() {}
};

BaseObject* CreateObjectFromStream(std::istream& is);

As an aside, I would recommend using references, boost::optional, or shared_ptr instead of raw pointers. In a perfect world, the interface should tell me who owns this object. As a user, am I responsible for deleting this pointer when it's given to me? It's painfully clear when it's a shared_ptr.

Implementing Runtime Factories

In another header, put the details of managing the scope of when the factories are active.

class RuntimeFactory {
public:
    virtual BaseObject* create(std::istream& is) = 0;
};

void RegisterRuntimeFactory(RuntimeFactory* factory);
void UnregisterRuntimeFactory(RuntimeFactory* factory);

I think the salient point in all of this is that usage is a different concern from how the factories are initialized and used.

We should note that the callers of these free functions own the factories. The registry does not own them.

This isn't strictly necessary, though it offers more control when and where these factories get destroyed. The point where it matters is when you see things like "post-create" or "pre-destroy" calls. Factory methods with these sorts of names are design smells for ownership inversion.

Writing another wrapper around this to manage the factories life-time would be simple enough anyway. It also lends to composition, which is better.

Registering Your New Factory

Write wrappers for each factory registration. I usually put each factory registration in its own header. These headers are usually just two function calls.

void RegisterFooFactory();
void UnregisterFooFactory();

This may seem like overkill, but this sort of diligence keeps your compile times down.

My main then is reduced to a bunch of register and unregister calls.

#include <foo_register.h>
#include <bar_register.h>

int main(int argc, char* argv[]) {
    SetupLogging();
    SetupRuntimeFactory();
    RegisterFooFactory();
    RegisterBarFactory();

    // do work...

    UnregisterFooFactory();
    UnregisterBarFactory();
    CleanupLogging();
    return 0;
}

Avoid Static Init Pitfalls

This specifically avoids objects created during static loading like some of the other solutions. This is not an accident.

The C++ spec won't give you useful assurances about when static loading will occur
You'll get a stack trace when something goes wrong
The code is simple, direct, easy to follow

Implementing the Registry

Implementation details are fairly mundane, as you'd imagine.

class RuntimeFactoryRegistry {
public:
    void registerFactory(RuntimeFactory* factory) {
        factories.insert(factory);
    }

    void unregisterFactory(RuntimeFactory* factory) {
        factories.erase(factory);
    }

    BaseObject* create(std::istream& is) {
        std::set<RuntimeFactory*>::iterator cur = factories.begin();
        std::set<RuntimeFactory*>::iterator end = factories.end();
        for (; cur != end; cur++) {
            // reset input?
            if (BaseObject* obj = (*cur)->create(is)) {
                return obj;
            }
        }
        return 0;
    }

private:
    std::set<RuntimeFactory*> factories;
};

This assumes that all factories are mutually exclusive. Relaxing this assumption is unlikely to result in well-behaving software. I'd probably make stronger claims in person, hehe. Another alternative would be to return a list of objects.

The below implementation is static for simplicity of demonstration. This can be a problem for multi-threaded environments. It doesn't have to be static, nor do I recommend it should or shouldn't be static, it just is here. It isn't really the subject of the discussion, so I'll leave it at that.

These free functions only act as pass-through functions for this implementation. This lets you unit test the registry or reuse it if you were so inclined.

namespace {

    static RuntimeFactoryRegistry* registry = 0;

} // anon    

void SetupRuntimeFactory() {
    registry = new RuntimeFactoryRegistry;
}

void CleanupRuntimeFactory() {
    delete registry;
    registry = 0;
}

BaseObject* CreateObjectFromStream(std::istream& is) {
    return registry->create(is);
}

void RegisterRuntimeFactory(RuntimeFactory* factory) {
    registry->registerFactory(factory);
}

void UnregisterRuntimeFactory(RuntimeFactory* factory) {
    registry->unregisterFactory(factory);
}

In your RuntimeFactory interface, it should contain `registerFactory` and `unregisterFactory` virtual function right and not just be a 'emtpy' interface as shown in its definition — dchhetri, Jul 03 '13 at 03:56
@user814628 They are not part of the class, nor are they intended to be. `RuntimeFactory` is only concerned with creating `BaseObject` instances. — Tom Kerr, Jul 03 '13 at 04:06
I'm confused on how `registry->registerFactory` is a functional code in your RegisterRuntimeFactory free-function, if RuntimeFactory does not contain that method? **EDIT: nvmd got it** — dchhetri, Jul 03 '13 at 04:13
Also maybe thread concern should be mentioned or given some disclaimer with this impl — dchhetri, Jul 03 '13 at 04:15
I like your main ideas. I think your concern about the order in which static loading occurs can be alleviated, atleast in part, by using dynamic allocation of members that *must* be loaded prior to everything else http://stackoverflow.com/questions/17443163/insert-content-in-static-containers-of-a-templated-class-prior-to-main — Marc Claesen, Jul 03 '13 at 11:54
@MarcClaesen I wouldn't recommend relying on it, alleviated or no. I've spent my share of time debugging these sorts of things in the wild after they break. I assure you that I'm not just expressing a personal preference. It's cheaper to use simple and direct methods. — Tom Kerr, Jul 03 '13 at 15:40

score 1 · Answer 6 · answered Jul 03 '13 at 06:37

First, there's not really enough detail here to form an opinion, so I'm left to guess. You've provided a challenging question and a minimal solution, but not clarified what is wrong with your solution.

I suspect the complaint centers around the reset back to knowing nothing between a refused construction and the following construction attempts. Given a very large number of potential factories this reset could have us parsing the same data hundreds or thousands of times. If this is the problem the question is this: how do you structure the predicate evaluation phase to limit the amount of work, and allow it to reuse previous parsing results.

I suggest having each factory register with: 1) a factory builder function taking the specialization parameter(s) (iostream in the example) 2) an unordered set of boolean predicates 3) required boolean values of each predicate to allow construction

The set of predicates is used to create/modify the predicate tree. Interior nodes in the tree represent predicates (branching to 'pass', 'fail', and possibly 'don't care'). Both interior nodes and leaves hold constructors which are satisfied if the ancestral predicates are satisfied. As you traverse the tree you first look for constructors at the current level, then evaluate the predicate and follow the required path. If no solution is found along that child path the follow the 'don't care' path.

This allows new factories to share predicate functions. There's probably lots of questions about managing/sorting the tree when the factories go on/off line. There's also the possibility of parser state data that needs to be retained across predicates and reset when construction is completed. There's lots of open questions, but this may work toward addressing the perceived problems with your solution.

TL:DR; Create a graph of predicates to traverse when attempting construction.

tp1 · Answer 7 · 2013-07-02T18:06:35.717

Simple solution is just a switch-case:

Base *create(int type, std::string data) {
  switch(type) { 
   case 0: return new Derived1(data); 
   case 1: return new Derived2(data);
  };
}

But then it's just deciding which type you want:

   int type_of_obj(string s) {
      int type = -1;
      if (isderived1(s)) type=0;
      if (isderived2(s)) type=1;
      return type;
   }

Then it's just connecting the two:

Base *create_obj(string s, string data, 
                 Base *(*fptr)(int type, string data), 
                 int (*fptr2)(string s)) 
{
   int type = fptr2(s);
   if (type==-1) return 0;
   return fptr(type, data);   
}

Then it's just registering the function pointers:

   class Registry {
   public:
       void push_back(Base* (*fptr)(int type, string data),
                      int (*fptr2)(string s));
       Base *create(string s, string data);
   };

The plugin will have the 2 functions, and the following:

void register_classes(Registry &reg) {
    reg.push_back(&create, &type_of_obj);
    ...
}

Plugin loader will dlopen/dlsym the register_classes functions.

(on the other hand, I'm not using this kind of plugins myself because creating new plugins is too much work. I have better way to provide modularity for my program's pieces. What kills plugins is the fact that you need to modify your build system to create new dll's or shared_libs, and doing that is just too much work - ideally new module is just one class; without anything more complicated build system modifications)

Elegant way to implement extensible factories in C++

7 Answers7

Linked