4

I am looking for a simple way to reduce header coupling in a C++ project, which comes mostly from (overused) class composition, which of course requires a complete type. For example:

// header A
class A
{
  B b; // requires header B
};

I have also considered interfaces and pimpl, but both imply some boilerplate code which I do not want to write/support manually (or is there a way to make this automatic?).

So I thought about replacing the member with a pointer and a forward declaration, like class B* pB;, but this requires handling object creation and deletion. OK, I could use smart pointers for deletion (not auto_ptr though, as it requires a complete type at creation, so say something like shared_ptr<class B> pB;), but how do I handle object creation now?

I could create the object in A's constructor, like pB = new B; but this is, again, manual, and what is worse, there could be several constructors... So I'm looking for a way to do this automatically, one that would be as simple as changing B b; to autoobjptr<class B> pB; in A's definition, without having to bother with instantiating pB.
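For reference, the manual variant described above might look roughly like the sketch below. It is squeezed into one file for illustration, with comments marking what each real file would see; the function name use_a_without_b and the member value are made up for the example, and std::shared_ptr stands in for whichever shared_ptr is available. The point it tries to show is that the pointer member needs B complete only where the object is created.

```cpp
#include <cassert>
#include <memory>

class B;  // forward declaration only, as A's header would have it

class A {
    std::shared_ptr<B> pB;  // the deleter is captured when the pointer is created
public:
    A();                    // still has to be defined where B is complete
    int value() const;
    // No user-declared destructor: the implicit one is fine with incomplete B.
    // Note that copies of A share the same B with this member type.
};

// B is still incomplete here, yet an A can be created, used and destroyed.
int use_a_without_b() {
    A a;
    int v = a.value();
    assert(v == 7);
    return v;
}

// --- from here on: what A.cpp would see after including B's header ---
class B {
public:
    int value;
    B() : value(7) {}
};

A::A() : pB(new B) {}  // the manual step that every constructor of A must repeat
int A::value() const { return pB->value; }
```

The repeated pB(new B) in each constructor is exactly the part the question asks to automate.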

I'm pretty sure this is not a new idea, so maybe you could give me a reference to a common solution or a discussion?

UPDATE: To clarify, I am not trying to break the dependency between A and B, but I want to avoid inclusion of B's header when one includes A's one. In practice, B is used in implementation of A, so a typical solution would be to create an interface or pimpl for A but I am looking for something easier for the moment.

UPDATE2: I suddenly realized that a lazy pointer such as proposed here would do the trick (too bad there is no standard implementation of this in say boost), when combined with virtual destructor (to permit incomplete type). I still do not get why there is no standard solution and feel like re-inventing the wheel...

UPDATE3: Suddenly, Sergei Tachenov came up with a very simple solution (the accepted answer), though it took me half an hour to understand why it really works... If you remove the A() constructor or define it inline in the header file, the magic won't work anymore (compilation error). I guess that when you define an explicit non-inline constructor, the construction of members (even the implicit ones) is done inside the same compilation unit (A.cpp) where the type B is complete. On the other hand, if your A constructor is inline, creation of the members must happen inside other compilation units and won't work as B is incomplete there. Well, this is logical, but now I'm curious - is this behavior defined by the C++ standard?

UPDATE4: Hopefully, the final update. Refer to the accepted answer and the comments for a discussion on the question above.

Roman L

5 Answers

3

At first I got intrigued by this question as it looked like something really tricky to do, and all the comments about templates, dependencies and includes made sense. But then I tried to actually implement this and found it surprisingly easy. So either I misunderstood the question or the question has some special property of looking much harder than it really is. Anyway, here is my code.

This is the glorified autoptr.h:

#ifndef TESTPQ_AUTOPTR_H
#define TESTPQ_AUTOPTR_H

template<class T> class AutoPtr {
  private:
    T *p;
  public:
    AutoPtr() {p = new T();}
    ~AutoPtr() {delete p;}
    T *operator->() {return p;}
};

#endif // TESTPQ_AUTOPTR_H

Looks really simple and I wondered if it actually works, so I made a test case for it. Here is my b.h:

#ifndef TESTPQ_B_H
#define TESTPQ_B_H

class B {
  public:
    B();
    ~B();
    void doSomething();
};

#endif // TESTPQ_B_H

And b.cpp:

#include <stdio.h>
#include "b.h"

B::B()
{
  printf("B::B()\n");
}

B::~B()
{
  printf("B::~B()\n");
}

void B::doSomething()
{
  printf("B does something!\n");
}

Now for the A class that actually uses this. Here's a.h:

#ifndef TESTPQ_A_H
#define TESTPQ_A_H

#include "autoptr.h"

class B;

class A {
  private:
    AutoPtr<B> b;
  public:
    A();
    ~A();
    void doB();
};

#endif // TESTPQ_A_H

And a.cpp:

#include <stdio.h>
#include "a.h"
#include "b.h"

A::A()
{
  printf("A::A()\n");
}

A::~A()
{
  printf("A::~A()\n");
}

void A::doB()
{
  b->doSomething();
}

Ok, and finally the main.cpp that uses A, but doesn't include "b.h":

#include "a.h"

int main()
{
  A a;
  a.doB();
}

Now it actually compiles without a single error or warning, and works:

d:\alqualos\pr\testpq>g++ -c -W -Wall b.cpp
d:\alqualos\pr\testpq>g++ -c -W -Wall a.cpp
d:\alqualos\pr\testpq>g++ -c -W -Wall main.cpp
d:\alqualos\pr\testpq>g++ -o a a.o b.o main.o
d:\alqualos\pr\testpq>a
B::B()
A::A()
B does something!
A::~A()
B::~B()

Does that solve your problem or am I doing something completely different?

EDIT 1: Is it standard or not?

Okay, it seems it was the right thing, but now it leads us to other interesting questions. Here is the result of our discussion in the comments below.

What happens in the example above? The a.h file doesn't need b.h file because it doesn't actually do anything with b, it just declares it, and it knows its size because the pointer in the AutoPtr class is always the same size. The only parts of autoptr.h that need the definition of B are constructor and destructor but they aren't used in a.h so a.h doesn't need to include b.h.
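To make this concrete, here is a minimal single-file sketch of the same layout, with comments marking what a.h and a.cpp would each see. The trace string and the run_trace helper are hypothetical additions for the illustration, and copying of AutoPtr is disabled here as an extra precaution that the original autoptr.h does not take.

```cpp
#include <cassert>
#include <string>

// Records the order of constructor/destructor calls (illustration only).
static std::string trace;

// --- what a.h sees: only a forward declaration of B ---
class B;

template <class T>
class AutoPtr {
    T *p;
public:
    AutoPtr() : p(new T()) {}  // needs complete T, but only where instantiated
    ~AutoPtr() { delete p; }   // likewise
    AutoPtr(const AutoPtr &) = delete;             // added precaution: no shallow copies
    AutoPtr &operator=(const AutoPtr &) = delete;
    T *operator->() { return p; }
};

class A {
    AutoPtr<B> b;  // fine with incomplete B: the member is just a pointer
public:
    A();           // declared only; defined where B is complete
    ~A();
    void doB();
};

// --- what a.cpp sees after #include "b.h": B is now complete ---
class B {
public:
    B() { trace += "B::B() "; }
    ~B() { trace += "B::~B() "; }
    void doSomething() { trace += "B::doSomething() "; }
};

A::A() { trace += "A::A() "; }    // AutoPtr<B>::AutoPtr() gets instantiated for this
A::~A() { trace += "A::~A() "; }  // AutoPtr<B>::~AutoPtr() gets instantiated for this
void A::doB() { b->doSomething(); }

// Hypothetical helper: builds and destroys one A and returns the call trace.
std::string run_trace() {
    trace.clear();
    {
        A a;
        a.doB();
    }
    assert(!trace.empty());
    return trace;
}
```

Calling run_trace() yields the same order as the console output above: B is constructed before A's constructor body runs and destroyed after A's destructor body.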

But why exactly doesn't a.h use B's constructor? Aren't B's fields initialized whenever we create an instance of A? If so, the compiler may try to inline this code at every instantiation of A, but then it will fail. In the example above, it looks like the B::B() call is put at the beginning of the compiled constructor A::A() in the a.cpp unit, but does the standard require it?

At first it seems that nothing stops the compiler from inlining the field initialization code whenever an instance is created, so A a; turns into this pseudocode (not real C++ of course):

A a;
a.b->B();
a.A();

Could such compilers exist according to the standard? The answer is no, they couldn't and the standard has nothing to do with it. When the compiler compiles "main.cpp" unit, it has no idea what A::A() constructor does. It could be calling some special constructor for b, so inlining the default one before it would make b initialized twice with different constructors! And the compiler has no way to check for it since the "a.cpp" unit where A::A() is defined is compiled separately.

Okay, now you may think, what if a smart compiler wants to look at B's definition and if there is no other constructor than the default one, then it would put no B::B() call in the A::A() constructor and inline it instead whenever the A::A() is called. Well, that's not going to happen either because the compiler has no way to guarantee that even if B doesn't have any other constructors right now, it won't have any in the future. Suppose we add this to b.h in the B class definition:

B(int b);

Then we put its definition in b.cpp and modify a.cpp accordingly:

A::A():
  b(17) // magic number; assumes AutoPtr gets a constructor that forwards its argument to B
{
  printf("A::A()\n");
}

Now when we recompile a.cpp and b.cpp, it will work as expected even if we don't recompile main.cpp. That's called binary compatibility and the compiler shouldn't break that. But if it inlined the B::B() call, we would end up with a main.cpp that calls two B constructors. Since adding constructors and non-virtual methods should never break binary compatibility, no reasonable compiler should be allowed to do that.
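Note that the AutoPtr from autoptr.h only has a default constructor, so b(17) needs a small extension to compile. One possible way, shown as a single-file sketch: the one-argument forwarding constructor, the get() accessor on B and the demo_value helper are all assumptions made for this example, not part of the original test program.

```cpp
#include <cassert>

class B;  // forward declaration, as in a.h

template <class T>
class AutoPtr {
    T *p;
public:
    AutoPtr() : p(new T()) {}
    // Assumed addition: forwards one argument to T's constructor.
    template <class Arg>
    explicit AutoPtr(const Arg &arg) : p(new T(arg)) {}
    ~AutoPtr() { delete p; }
    AutoPtr(const AutoPtr &) = delete;
    AutoPtr &operator=(const AutoPtr &) = delete;
    T *operator->() { return p; }
};

class A {
    AutoPtr<B> b;
public:
    A();
    ~A();
    int value();
};

// --- what a.cpp sees after #include "b.h" ---
class B {
    int n;
public:
    B() : n(0) {}
    explicit B(int v) : n(v) {}
    int get() const { return n; }
};

A::A() : b(17) {}  // picks B(int); B's default constructor is never called
A::~A() {}
int A::value() { return b->get(); }

// Hypothetical helper showing that only B(int) ran.
int demo_value() {
    A a;
    int v = a.value();
    assert(v == 17);
    return v;
}
```

Code that only sees a.h cannot tell which of B's constructors A::A() picks, which is the reason given above for why member initialization cannot be hoisted out of A::A().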

The last reason for such compilers not to exist is that it doesn't actually make any sense. Even if the member initialization were inlined, it would just increase the code size and give absolutely no performance increase, since there would still be one method call for A::A(), so why not let this method do all the work in one place?

EDIT 2: Okay, what about inline and auto-generated constructors of A?

Another question that arises is what will happen if we remove A::A() from both a.h and a.cpp? Here's what happens:

d:\alqualos\pr\testpq>g++ -c -W -Wall a.cpp
d:\alqualos\pr\testpq>g++ -c -W -Wall main.cpp
In file included from a.h:4:0,
                 from main.cpp:1:
autoptr.h: In constructor 'AutoPtr<T>::AutoPtr() [with T = B]':
a.h:8:9:   instantiated from here
autoptr.h:8:16: error: invalid use of incomplete type 'struct B'
a.h:6:7: error: forward declaration of 'struct B'
autoptr.h: In destructor 'AutoPtr<T>::~AutoPtr() [with T = B]':
a.h:8:9:   instantiated from here
autoptr.h:9:17: warning: possible problem detected in invocation of delete 
operator:
autoptr.h:9:17: warning: invalid use of incomplete type 'struct B'
a.h:6:7: warning: forward declaration of 'struct B'
autoptr.h:9:17: note: neither the destructor nor the class-specific operator 
delete will be called, even if they are declared when the class is defined.

The only error message that is relevant is "invalid use of incomplete type 'struct B'". Basically it means that main.cpp now needs to include b.h, but why? Because the auto-generated constructor is inlined when we instantiate a in main.cpp. Okay, but does this always have to happen or does it depend on the compiler? The answer is that it can't depend on the compiler. No compiler can make an auto-generated constructor non-inline, because it doesn't know where to put its code. From the programmer's point of view the answer is obvious: the constructor should go in the unit where all the other methods of the class are defined, but the compiler doesn't know which unit that is. Besides, class methods could be spread across several units, and sometimes that even makes sense (for example, if part of the class is auto-generated by some tool).

And of course, if we make A::A() explicitly inline either by using the inline keyword or by putting its definition inside the A class declaration, the same compilation error would occur, possibly a bit less cryptic.

The conclusion

It seems it's perfectly fine to employ the technique described above for auto-instantiated pointers. The only thing I'm not sure of is whether the AutoPtr<B> b; declaration inside a.h will work with any compiler. I mean, we can use a forward-declared class when declaring pointers and references, but is it always correct to use it as a template argument? I think there's nothing wrong with that, but compilers may think otherwise. Googling didn't yield any useful results on it either.

Sergei Tachenov
  • The catch here is that you can't make any constructors or destructor of `class Neg` inline. But there's no way around that if you want to avoid #including b.h. – aschepler Dec 15 '10 at 18:33
  • @Sergey Tachenov: You are actually solving a different problem. In my case (see the UPDATE section at the top) the class `B` is used only internally inside `A`, and I want to avoid inclusion of `B`'s header outside of A.cpp (while you do include B.h in your Neg.cpp). – Roman L Dec 15 '10 at 18:36
  • Now I'm really confused. You wrote that you want to avoid including "b.h" whenever one needs to use "a.h". But "neg.cpp" is the implementation of Neg so of course it needs "b.h". If you're going to use Neg in any other place, you don't need to include "b.h". I'll edit my answer to clarify that. Then please correct me if I got it wrong again. – Sergei Tachenov Dec 15 '10 at 18:37
  • Indeed, I was not completely correct, but see what happens when you create C.cpp and include your Neg.h there (it would be simpler for me if you kept the original `A` name though, not renaming it to `Neg`...) – Roman L Dec 15 '10 at 18:40
  • Well, I just took some test program that worked with negation and modified it, hence the name Neg. But you're right, that's confusing, I'll fix that. – Sergei Tachenov Dec 15 '10 at 18:44
  • Okay, here you go. "main.cpp" instead of "C.cpp", but this time it's not important I guess. – Sergei Tachenov Dec 15 '10 at 18:53
  • Wow, great. I wouldn't even imagine that this could work... But it seems I finally understood *why* it works. If you remove A() constructor or define it inline in the header file, the magic won't work anymore (compilation error). I guess that when you define an explicit non-inline constructor, the construction of members (even *implicit* ones) is forced to be placed in the same compilation unit (A.cpp) where the type (B) is complete. Well, this is logical, but now I'm curious - is this behavior forced by the C++ standard? – Roman L Dec 15 '10 at 19:41
  • Well, as far as I understand it, implicit initialization is just a piece of code that is added at the beginning of every constructor, implicit or explicit. And when the constructor is in some unit, that code ends up there too. I am not 100% sure whether the standard doesn't allow compiler to inline that code, though. Maybe it's even worth a new question. – Sergei Tachenov Dec 15 '10 at 19:58
  • I think I got it! The standard doesn't need to say anything. It is impossible for a compiler to inline implicit initialization of members because when it compiles the "main.cpp" unit it doesn't know what the constructor of the A class really does (as the "a.cpp" unit was compiled separately). Maybe `A::A()` calls some special constructor for the field b. In this case inlining a default constructor call would be an error since the object would be initialized twice. So a compiler can only inline auto-generated default constructors. – Sergei Tachenov Dec 15 '10 at 20:06
  • Looks correct to me, except for the conclusion sentence - it seems to me that it won't be a problem for the compiler to inline a constructor explicitly defined in some header. But in the case when an explicit constructor of a class is implemented inside a certain compilation unit, there is no way the compiler could inline construction of its members outside of that unit, no matter what kind they are (explicit/inline etc). – Roman L Dec 15 '10 at 20:39
  • I wasn't talking about constructors in headers. Any method defined inside a class declaration is always inlined even if you don't use the inline keyword. But that is not an optimization so I didn't mention this case. And when you put a method definition inside a header but not inside a class definition and without using the inline keyword, it's just plain wrong so I wasn't talking about this case either. I was comparing explicitly defined non-default constructors with auto-generated ones. I also think auto-generated ones are always inlined because the compiler doesn't know where to put it. – Sergei Tachenov Dec 16 '10 at 05:08
  • I completely agree with that, but I still wouldn't agree with your phrase "So a compiler can only inline auto-generated default constructors", because I do not see how you make a difference between auto-generated default constructors and manually-written *inline* ones (those are explicit too, aren't they?). So for me it is rather not that a compiler *can only* inline some constructors as you say, but that it *cannot* inline *any member constructors* in the case of a "non-inline" (static?) constructor. – Roman L Dec 16 '10 at 14:08
  • And thanks for the extended update of your answer, it could be quite interesting for newcomers. – Roman L Dec 16 '10 at 14:10
  • "I still wouldn't agree with your phrase" - by "can only inline" I meant "can inline as a part of optimization, out of its own 'will'". In case of manually-written inline constructors not only it "can" inline them, it must do so. It's just my non-native English made me phrase it in a confusing way. – Sergei Tachenov Dec 16 '10 at 15:30
  • Actually, I think it is just that the terminology is confusing (we thus have to distinguish *inline* and *force inline* and it is still unclear...). But still, isn't the compiler forced to inline default constructors as well? Where could it put them non-inlined otherwise? I just don't get, what is the difference that you are trying to indicate, between the inline constructors that one adds himself and the default ones auto-generated by the compiler? – Roman L Dec 16 '10 at 17:40
  • The only difference is that the case with written inline constructor is obvious, and the one with auto-generated ones took me some time to figure out. There is no deep meaning in it, really. What you are saying is completely correct. It is me who got confused by the whole thing at that point. – Sergei Tachenov Dec 16 '10 at 18:10
1

I'm pretty sure this can be implemented in the same way as unique_ptr is implemented. The difference would be that the allocated_unique_ptr constructor would allocate the B object by default.

Note however that if you want automatic construction of the B object, it will be instantiated with the default constructor.
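A rough sketch of what this might look like, assuming a compiler with std::unique_ptr available. The name allocated_unique_ptr is taken from the answer; the answer() member on B and the demo_answer helper are made up for the example. As with any approach that allocates inside the template, A's constructor and destructor still have to be defined where B is complete.

```cpp
#include <cassert>
#include <memory>

class B;  // forward declaration only

// Hypothetical wrapper: a unique_ptr that default-constructs its pointee.
template <class T>
class allocated_unique_ptr {
    std::unique_ptr<T> p;
public:
    allocated_unique_ptr() : p(new T()) {}  // always uses T's default constructor
    T *operator->() const { return p.get(); }
    T &operator*() const { return *p; }
};

class A {
    allocated_unique_ptr<B> b;
public:
    A();   // defined where B is complete
    ~A();  // likewise, because unique_ptr deletes B in its destructor
    int answer();
};

// --- what A's source file would see after including B's header ---
class B {
public:
    int answer() const { return 42; }
};

A::A() {}
A::~A() {}
int A::answer() { return b->answer(); }

// Hypothetical helper exercising the wrapper.
int demo_answer() {
    A a;
    int r = a.answer();
    assert(r == 42);
    return r;
}
```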

Didier Trosset
  • I don't get it. This `allocated_unique_ptr` will have to create the object at some point, where would I put this code? I'm fine with the default constructor. – Roman L Dec 15 '10 at 16:10
  • This will still either 1) be templated in B, so at the time of construction need B.h or 2) be specialized for B, thus allocated_unique_ptr.h would need B.h as far as I can see. – rubenvb Dec 15 '10 at 17:06
0

Well, you gave the best solution yourself: use pointers, and new them in the constructor... If there is more than one constructor, repeat that code there. You could make a base class which does that for you, but that would only mystify the implementation...

Have you thought about a template in class B? This can also solve your header cross-dependencies, but will most likely increase your compile time... Which brings us to the reason you are trying to avoid these #includes. Have you measured compile time? Is it troubling? Is this the problem?

UPDATE: example for template way:

// A.h
template<class T>
class A
{
public:
    A(): p_t( new T ) {}
    virtual ~A() { delete p_t; }
private:
    T* p_t;
};

Again, this will most likely not reduce compile time (B.h will need to be pulled in to create the template instance A<B>); it does, however, allow you to remove the includes from the A header and source file.

rubenvb
  • Yes, compile time is troubling. What do you mean by template in class B? – Roman L Dec 15 '10 at 15:38
  • If I got your idea correctly, when I include A.h I will also have to include B.h, bringing me back to my original problem. What I'm trying to do is to stop "propagation" of B.h include to A.h includers. – Roman L Dec 15 '10 at 15:56
  • Well, you can't create a class with some other class inside it, without knowing both classes... This template explicitly moves the dependency to where A is used, so out of the A.h header/source. More you can't possibly do. I see no reason (unless there's one you haven't given me) why B.h should include A.h? – rubenvb Dec 15 '10 at 15:59
  • I can have class A with a pointer to B inside, without having to include B.h. There is no inclusion of A.h from B.h, probably I've made myself unclear at some point. – Roman L Dec 15 '10 at 16:07
  • @7vies: well, then just use pointers... It really depends on if there is one A for many types of B, or if there are many A's for a limited number of B's etc... My template shifts the include to another file, a pointer is the obvious solution, but that's not what you want (the reason for this depends on the situation of course). – rubenvb Dec 15 '10 at 16:22
  • I like the pointer solution, except that I don't like doing stupid things (such as `pB = new B;` in constructor and `delete pB;` in destructor) manually if they can be done automatically. Unfortunately your template does not help in my case, as B is typically used only inside A so there is no need to include B.h everywhere A is used. Thanks for your ideas in any case. – Roman L Dec 15 '10 at 16:31
0

You could write an automatic pimpl_ptr<T> which would automatically new, delete, and copy the contained T.
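A minimal sketch of such a class, under the assumption that T is default-constructible and copy-constructible. The member names of A and B and the copies_are_independent helper are invented for the illustration. Every special member of A is declared in the class and defined only after B is complete, because each of them instantiates the corresponding pimpl_ptr member.

```cpp
#include <cassert>

// Hypothetical pimpl_ptr: owns a T, deep-copies it, deletes it.
template <class T>
class pimpl_ptr {
    T *p;
public:
    pimpl_ptr() : p(new T()) {}
    pimpl_ptr(const pimpl_ptr &other) : p(new T(*other.p)) {}
    pimpl_ptr &operator=(const pimpl_ptr &other) {
        if (this != &other) {
            T *fresh = new T(*other.p);  // copy first, so a throw leaves *this intact
            delete p;
            p = fresh;
        }
        return *this;
    }
    ~pimpl_ptr() { delete p; }
    T *operator->() { return p; }
    const T *operator->() const { return p; }
};

class B;  // forward declaration, as A's header would have it

class A {
    pimpl_ptr<B> b;
public:
    A();
    A(const A &);
    A &operator=(const A &);
    ~A();
    void set(int v);
    int get() const;
};

// --- what A's source file would see after including B's header ---
class B {
public:
    int value;
    B() : value(0) {}
};

A::A() {}
A::A(const A &other) : b(other.b) {}
A &A::operator=(const A &other) { b = other.b; return *this; }
A::~A() {}
void A::set(int v) { b->value = v; }
int A::get() const { return b->value; }

// Hypothetical helper: a copy of A gets its own B.
bool copies_are_independent() {
    A first;
    first.set(1);
    A second(first);
    assert(second.get() == 1);  // the copy starts with the same value
    second.set(2);
    return first.get() == 1 && second.get() == 2;
}
```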

Puppy
  • At which point would I have the instantiation? It cannot be at pimpl_ptr creation, as the type is incomplete. – Roman L Dec 15 '10 at 15:40
  • @7vies, you'd need to know the full type of `T` at any construction, destruction or call to `pimpl_ptr`. So, this means no compiler-generated constructors, destructors or assignments, for starters, unless they're in a context where `T` is fully known. – Nathan Ernst Dec 15 '10 at 15:53
  • You could explicitly export pimpl_ptr from B.cpp. – Puppy Dec 15 '10 at 17:41
0

A problem with your approach is that while you could avoid including the header file for B in this way, it doesn't really reduce dependencies.

A better way to reduce dependencies is by letting B derive from a base class declared in a separate header file, and use a pointer to that base class in A. You'd still need to manually create the correct descendant (B) in A's constructor, of course.

It is also quite possible that the dependency between A and B is real, and in that case you're improving nothing by artificially avoiding the inclusion of B's header file.
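The base-class variant could look something like the single-file sketch below. The interface name BBase, its name() method and the file names in the comments are hypothetical. A's header then depends only on the small interface, while the concrete B is still created by hand in A's constructor.

```cpp
#include <cassert>
#include <string>

// --- hypothetical b_base.h: a small abstract interface ---
class BBase {
public:
    virtual ~BBase() {}
    virtual std::string name() const = 0;
};

// --- a.h: needs at most b_base.h, never b.h ---
class A {
    BBase *b;
public:
    A();
    ~A();
    A(const A &) = delete;             // keep the sketch free of double deletes
    A &operator=(const A &) = delete;
    std::string describe() const;
};

// --- b.h: the concrete class, included only by a.cpp ---
class B : public BBase {
public:
    std::string name() const override { return "B"; }
};

// --- a.cpp ---
A::A() : b(new B()) {}   // the correct descendant is still created manually
A::~A() { delete b; }    // safe through the base thanks to the virtual destructor
std::string A::describe() const { return "A uses " + b->name(); }

// Hypothetical helper exercising the class.
std::string demo_describe() {
    A a;
    std::string s = a.describe();
    assert(!s.empty());
    return s;
}
```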

Frederik Slijkerman
  • Well, in my case the dependency is real, and (for the moment) I am not trying to reduce it, but what I want to avoid is chained header inclusion. – Roman L Dec 15 '10 at 15:45