Please consider the following code:

struct A
{
    virtual ~A() {}
    virtual int go() = 0;
};

struct B : public A { int go() { return 1; } };

struct C : public B { int go() { return 2; } };

int main()
{
    B b;
    B &b_ref = b;

    return b_ref.go();
}

Under GCC 4.4.1 (using -O2), the call to B::go() gets inlined (i.e., no virtual dispatch happens). That means the compiler recognizes that b_ref refers to an object whose dynamic type is exactly B. A B reference could refer to a C, but the compiler is smart enough to see that this is not the case here, so it optimizes the virtual call away entirely and inlines the function.

Great! That's an incredible optimization.

But, then, why doesn't GCC do the same in the following case?

struct A
{
    virtual ~A() {}
    virtual int go() = 0;
};

struct B : public A { int go() { return 1; } };

struct C : public B { int go() { return 2; } };

int main()
{
    B b;
    A &b_ref = b;

    return b_ref.go(); // B::go() is not inlined here, and a virtual dispatch is issued
}

Any ideas? What about other compilers? Is this kind of optimization common? (I'm very new to this kind of compiler insight, so I'm curious.)
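For comparison, the language does offer one way to skip virtual dispatch that doesn't depend on the optimizer at all: a call with an explicitly qualified name is bound statically. Below is a minimal sketch using the same A/B/C hierarchy; the helper names call_virtual and call_as_b are made up for illustration and aren't part of my real code.

```cpp
struct A
{
    virtual ~A() {}
    virtual int go() = 0;
};

struct B : public A { int go() { return 1; } };

struct C : public B { int go() { return 2; } };

// Hypothetical helper: ordinary call through the base reference,
// resolved at runtime according to the dynamic type of the object.
int call_virtual(A &a) { return a.go(); }

// Hypothetical helper: the qualified name B::go suppresses virtual
// dispatch, so B's version runs even if `b` actually refers to a C.
int call_as_b(B &b) { return b.B::go(); }
```

Note that the qualified call changes behavior when a C is passed in (B::go() runs instead of C::go()), so it's only appropriate when the function is really meant for B alone.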

If the second case worked I could create some really great templates, like these:

template <typename T>
class static_ptr_container
{
public:
    typedef T st_ptr_value_type;

    operator T *() { return &value; }
    operator const T *() const { return &value; }

    T *operator ->() { return &value; }
    const T *operator ->() const { return &value; }

    T *get() { return &value; }
    const T *get() const { return &value; }

private:
    T value;
};

template <typename T>
class static_ptr
{
public:
    typedef static_ptr_container<T> container_type;
    typedef T st_ptr_value_type;

    static_ptr() : container(NULL) {}
    static_ptr(container_type *c) : container(c) {}

    inline operator st_ptr_value_type *() { return container->get(); }
    inline st_ptr_value_type *operator ->() { return container->get(); }

private:
    container_type *container;
};

template <typename T>
class static_ptr<static_ptr_container<T>>
{
public:
    typedef static_ptr_container<T> container_type;
    typedef typename container_type::st_ptr_value_type st_ptr_value_type;

    static_ptr() : container(NULL) {}
    static_ptr(container_type *c) : container(c) {}

    inline operator st_ptr_value_type *()  { return container->get(); }
    inline st_ptr_value_type *operator ->()  { return container->get(); }

private:
    container_type *container;
};

template <typename T>
class static_ptr<const T>
{
public:
    typedef const static_ptr_container<T> container_type;
    typedef const T st_ptr_value_type;

    static_ptr() : container(NULL) {}
    static_ptr(container_type *c) : container(c) {}

    inline operator st_ptr_value_type *() { return container->get(); }
    inline st_ptr_value_type *operator ->() { return container->get(); }

private:
    container_type *container;
};

template <typename T>
class static_ptr<const static_ptr_container<T>>
{
public:
    typedef const static_ptr_container<T> container_type;
    typedef typename container_type::st_ptr_value_type st_ptr_value_type;

    static_ptr() : container(NULL) {}
    static_ptr(container_type *c) : container(c) {}

    inline operator st_ptr_value_type *() { return container->get(); }
    inline st_ptr_value_type *operator ->() { return container->get(); }

private:
    container_type *container;
};

These templates could be used to avoid virtual dispatch in many cases:

// without static_ptr<>
void func(B &ref);

int main()
{
    B b;
    func(b); // since func() can't be inlined, there is no telling I'm not
             // gonna pass it a reference to a derivation of `B`

    return 0;
}

// with static_ptr<>
void func(static_ptr<B> ref);

int main()
{
    static_ptr_container<B> b;
    func(&b); // here, func() could inline operator->() from static_ptr<> and
             // static_ptr_container<> and be dead-sure it's dealing with an object
             // `B`; in cases func() is really *only* meant for `B`, static_ptr<>
             // serves both as a compile-time restriction for that type (great!)
             // AND as a big runtime optimization if func() uses `B`'s
             // virtual methods a lot -- and even gets to explore inlining
             // when possible

    return 0;
}

Would it be practical to implement that? (and don't go on saying it's a micro-optimization because it may well be a huge optimization..)

-- edit

I just noticed the problem with static_ptr<> has nothing to do with the problem I described. The pointer type is kept, but the call still isn't inlined. I guess GCC just doesn't go as deep as needed to find out that static_ptr_container<>::value is neither a reference nor a pointer. Sorry about that. But the question still remains unanswered.

-- edit

I've worked out a version of static_ptr<> that actually works. I've changed the name a bit, also:

template <typename T>
struct static_type_container
{
    // comment out this constructor if you can't use C++0x
    // (it needs variadic templates and std::forward from <utility>)
    template <typename ... CtorArgs>
    static_type_container(CtorArgs ... args)
            : value(std::forward<CtorArgs>(args)...) {}

    T value; // yes, it's that stupid.
};

struct A
{
    virtual ~A() {}
    virtual int go() = 0;
};

struct B : public A { int go() { return 1; } };

inline int func(static_type_container<B> *ptr)
{
    return ptr->value.go(); // B::go() gets inlined here, since
                            // static_type_container<B>::value
                            // is known to be always of type B
}

int main()
{
    static_type_container<B> d;
    return func(&d); // func() also gets inlined, resulting in main()
                     // that simply returns 1, as if it was a constant
}

The only weakness is that the user has to access ptr->value to get the actual object. Overloading operator ->() doesn't work in GCC: any method returning a reference to the actual object, even if it's inline, breaks the optimization. What a pity..

Gui Prá
  • Are you worried about the cost of virtual function dispatch? The cost is practically not measurable. In most complex systems the cost of the extra look-up will be dwarfed by nearly any other operation that stalls the processor (which will happen a lot). Code clarity is much more important than speed in the majority of cases (and the extra speed gain would not be worth the extra complexity for a human to read the code). – Martin York Dec 07 '10 at 01:34
  • I read a lot about that recently and I think it's the compiler's job... You should read about "Static C++ Object-Oriented Programming". They rely heavily on metaprogramming, as you did. – Julio Guerra Dec 07 '10 at 01:40
  • If you look carefully, you'll see there is absolutely no extra look-up going on. I went into this line of reasoning because I wanted to write a common interface to be implemented using three different base libraries; since two implementations can be instanced, users could mix up the variables, using instances of `B` with `A`'s and make a complete mess, for example. Additionally, the whole system would go behind an interface, incurring virtual function dispatching for every little bit of the system. That's not what I want; it's a 3D graphics library. – Gui Prá Dec 07 '10 at 01:41
  • And before anyone says I'm hurting OOP since base classes should be substitutable by their derived ones, I'm only using inheritance because it was a way to unify the interfaces: I wanted to make sure all implementations can be used the same way, but can't be mixed. Another way to go would be duplicating information, but that would also be bad.. – Gui Prá Dec 07 '10 at 01:45
  • @Julio Guerra: Thanks! Will be reading that non-stop.. (: – Gui Prá Dec 07 '10 at 01:54
  • I worry that the time your application saves (if any) will be seriously outweighed by (1) the time it took to write those 100 lines of code plus (2) the time some engineer has to spend each year trying to understand what you are doing. – Martin York Dec 07 '10 at 01:54
  • @Martin: the difference between a virtual function call and an out-of-line function call may not be huge, but the difference between either and an inlined function call - for trivial functions common in C++ (e.g. get/set) can often be an order of magnitude. – Tony Delroy Dec 07 '10 at 01:59
  • @MartinYork: Yes, I know. I wouldn't do those things in a commercial project, but this is for an experimental project of mine, which I use precisely to try out this sort of thing. After a few years of C++ it gets hard to gain insight into the language without doing things that would get you blamed by your colleagues or fired (: – Gui Prá Dec 07 '10 at 02:06
  • @n2liquid: my starting point was http://homepages.fh-regensburg.de/~mpool/mpool08/submissions/Levillain.pdf (the SCOOP 2 paradigm), and then I followed the references that interested me (and that are freely available on the internet...). Its main purpose is genericity with high performance, so static stuff with metaprogramming. – Julio Guerra Dec 07 '10 at 02:28
  • I'm reading what appears to be the first paper on SCOOP: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.2.8789&rep=rep1&type=pdf. Exactly what I've been looking for lately. I'll read the link you sent later, since it seems more recent. – Gui Prá Dec 07 '10 at 02:41
  • @Tony: Maybe, but: 1) Why do you have get/set methods that break encapsulation? 2) There are so many other things that can stall the processor, and they are several orders of magnitude larger, that unless you can prove there is a performance hit it is not even worth thinking about. (Optimize only if it is a problem, and it rarely is, unless you are a games programmer or predicting the weather.) – Martin York Dec 07 '10 at 03:46
  • @Martin: you're second guessing everything. get/set methods are appropriate for some classes - e.g. std::string::size(). They're often abused by beginners, but what isn't? n2liquid commented above that his problem space is a 3D graphics library: profiling's great, but a good 3D lib should aim to approach "as fast as possible" to give the client app more freedom from performance concerns. It's also expressly self-educational, so it's reasonable to want to ask and understand the actual performance - it empowers educated decision-making. – Tony Delroy Dec 07 '10 at 04:04
  • @n2liquid: just a wild guess, but could it be because A doesn't provide a `go()` implementation and hence the compiler tags it as "out-of-line"? If you put an empty body in, does the call through the `A&` get inlined? – Tony Delroy Dec 07 '10 at 04:07
  • @Tony: Not really. The method is still not inlined, even when I implement A::go() (in the class body, of course). It would be a funny mistake if GCC were fooled by that, though. I'm trying to implement the SCOOP idiom as mentioned by Julio Guerra now. It's a bit overkill and awkward, but it's certainly worth looking into (it's another solution to my initial "problem" and many others). – Gui Prá Dec 07 '10 at 04:38
  • If you try to do so, use their lib (called Static), which you can download from their website. They are using it in several projects. Good luck :-) – Julio Guerra Dec 08 '10 at 00:09
  • @JulioGuerra: Where can I find the lib? If it's http://www.lrde.epita.fr/, it seems to be offline :\ – Gui Prá Dec 08 '10 at 01:41
  • Exactly. I contacted "one of them" to find out how we could get it. I would like to take a look at the "Mini-Std" example. I'll let you know as soon as I get an answer. – Julio Guerra Dec 08 '10 at 03:23

1 Answer

This is not a definitive answer, but I thought I might post it anyway since it might be useful to some people.

Commenter Julio Guerra pointed out a C++ idiom (they call it a "paradigm" in the papers, but I think that's a bit too much) called Static C++ Object-Oriented Programming (SCOOP). I'll be posting this to give SCOOP more visibility.

SCOOP was invented to let C++ programmers get the best of both the OOP and the GP worlds by making the two play well together in C++. It is aimed primarily at scientific programming, because of the performance gain it can bring and because it can make code more expressive.

SCOOP makes C++ generic types emulate seemingly all aspects of traditional object-oriented programming, but statically. This means function templates can, for example, be properly overloaded and (apparently) produce much clearer error messages than your typical template function does.

It can also be used to do some funny tricks such as conditional inheritance.
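To give an idea of what I mean by conditional inheritance, here's a minimal sketch in which the base class is chosen at compile time. This is my own simplified illustration and not code from the papers, whose machinery is more elaborate; FastImpl, SafeImpl and Widget are hypothetical names, and I'm assuming a compiler with C++0x's std::conditional.

```cpp
#include <type_traits>

// Two hypothetical implementations to choose between.
struct FastImpl { int go() { return 1; } };
struct SafeImpl { int go() { return 2; } };

// Widget derives from FastImpl or SafeImpl depending on a
// compile-time flag; no virtual functions are involved.
template <bool UseFast>
struct Widget : std::conditional<UseFast, FastImpl, SafeImpl>::type
{
};
```

Widget<true>().go() then resolves to FastImpl::go() and Widget<false>().go() to SafeImpl::go(), both bound statically.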

What I was trying to accomplish with static_ptr<> was precisely a type of static object-orientation. SCOOP moves that to a whole new level.
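The core trick, as far as I understand it, is that the base class knows the exact derived type through a template parameter, so calls are resolved at compile time. Here's a minimal sketch of that idea (essentially the curiously recurring template pattern), assuming my A/B/C example were rewritten this way; it's a simplified illustration, not the full SCOOP machinery from the papers, and Base, go_impl and func are names I made up.

```cpp
// Static "interface": Exact is the concrete derived type.
template <typename Exact>
struct Base
{
    // Forwards to the derived class without any virtual call;
    // the target is known at compile time, so it can be inlined.
    int go() { return static_cast<Exact *>(this)->go_impl(); }
};

struct B : public Base<B> { int go_impl() { return 1; } };

struct C : public Base<C> { int go_impl() { return 2; } };

// Accepts any implementation of the interface, but each
// instantiation of func() deals with exactly one concrete type.
template <typename Exact>
int func(Base<Exact> &obj)
{
    return obj.go();
}
```

Since B and C no longer share a common non-template base, they can be used the same way but can't be mixed up at runtime, which is what I wanted from static_ptr<> in the first place.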

For those interested, I've found two papers on this: "A Static C++ Object-Oriented Programming (SCOOP) Paradigm Mixing Benefits of Traditional OOP and Generic Programming" and "Semantics-Driven Genericity: A Sequel to the Static C++ Object-Oriented Programming Paradigm (SCOOP 2)".

This idiom is not without its own faults, though: it's one of those uncommon things that should be your last resort, since people will most likely have a tough time figuring out what you did. Your code will also get more verbose, and you're likely to find yourself unable to do things you thought would be possible.

I'm sure it's still useful under some circumstances, though, not to mention real fun.

Happy template hacking.

Gui Prá