
I'm implementing inheritance in C (i.e. calling a superclass function with a subclass object), but I've run into an aliasing issue.

In the code below, the Shape functions are called with a Rectangle object.

With me->super, x and y are correct. With (Shape*)me, they are wrong.

I prefer (Shape*)me over me->super because I want to hide the struct implementation from clients.

Shouldn't the C standard guarantee that this works? Section 6.7.2.1, paragraph 13 says:

“… A pointer to a structure object, suitably converted, points to its initial member. There may be unnamed padding within a structure object, but not at its beginning”.

/*shape.h*/
#ifndef SHAPE_H
#define SHAPE_H

typedef struct Shape Shape;

Shape* Shape_ctor(int x, int y);
int Shape_getX(Shape* me);
int Shape_getY(Shape* me);

#endif

/*shape.c*/
#include <stdlib.h>
#include "shape.h"

struct Shape
{
    int x;
    int y;
};

Shape* Shape_ctor(int x, int y)
{
    Shape* me = malloc(sizeof(struct Shape));
    me->x = x;
    me->y = y;

    return me;
}

int Shape_getX(Shape* me)
{
    return me->x;
}

int Shape_getY(Shape* me)
{
    return me->y;
}

/*rectangle.h*/
#ifndef RECT_H
#define RECT_H

#include "shape.h"

typedef struct Rectangle Rectangle;


Rectangle* Rectangle_ctor(int x, int y, unsigned int width, unsigned int height);

int Rectangle_getWidth(Rectangle* me);
int Rectangle_getHeight(Rectangle* me);

#endif

/*rectangle.c*/
#include <stdlib.h>
#include "rectangle.h"
#include <stdio.h>

struct Rectangle
{
    Shape* super;
    unsigned int width;
    unsigned int height;
};

Rectangle* Rectangle_ctor(int x, int y, unsigned int width, unsigned int height)
{
    Rectangle* me = malloc(sizeof(struct Rectangle));
    me->super = Shape_ctor(x, y);
    me->width = width;
    me->height = height;

    printf("x: %d\n", Shape_getX(me->super)); //correct value
    printf("y: %d\n", Shape_getY(me->super)); //correct value

    printf("x: %d\n", Shape_getX((Shape*)me)); // wrong value
    printf("y: %d\n", Shape_getY((Shape*)me)); // wrong value

    return me;
}

int Rectangle_getWidth(Rectangle* me)
{
    return me->width;
}

int Rectangle_getHeight(Rectangle* me)
{
    return me->height;
}

/*main.c*/
#include <stdio.h>
#include "rectangle.h"

int main(void) {

  Rectangle* r1 = Rectangle_ctor(0, 2, 10, 15);
  printf("r1: (x=%d, y=%d, width=%d, height=%d)\n", Shape_getX((Shape*)r1)
                                                  , Shape_getY((Shape*)r1)
                                                  , Rectangle_getWidth(r1)
                                                  , Rectangle_getHeight(r1));

  return 0;
}
  • You can use a pointer to a struct object as a pointer to its first member, but in your case the first member is itself a pointer, so you have a pointer to a pointer: `*(Shape**)me` and `me->super` are equal when `super` is the first member. The important question is: why do you not want to use `me->super`? – mch Oct 27 '21 at 09:41
  • `Shape* super;` must be `Shape super;` or none of this makes much sense. – Lundin Oct 27 '21 at 10:31
  • Btw where is rectangle.h? – Lundin Oct 27 '21 at 10:46
  • @Lundin The rest of source code is added – Mr.nerd3345678 Oct 28 '21 at 02:51

3 Answers

2

You should place the base type itself as the first member, not a pointer to the base type.

struct Rectangle {
    Shape super;
    ...
};

Moreover, you should redesign Shape_ctor. I suggest taking a pointer as a parameter and delegating the memory management to the caller.


Shape* Shape_ctor(Shape *me, int x, int y)
{
    me->x = x;
    me->y = y;

    return me;
}

The constructor of rectangle would be:

Rectangle* Rectangle_ctor(Rectangle *me, int x, int y, unsigned int width, unsigned int height)
{
    Shape_ctor(&me->super, x, y); // call base constructor
    me->width = width;
    me->height = height;

    printf("x: %d\n", Shape_getX(&me->super)); //correct value
    printf("y: %d\n", Shape_getY(&me->super)); //correct value

    return me;
}

Typical usage:

Rectangle rect;
Rectangle_ctor(&rect, ...);

or a bit more exotic variants like:

Rectangle* rect = malloc(sizeof *rect);
Rectangle_ctor(rect, ...);

// or
Rectangle* rect = Rectangle_ctor(malloc(sizeof *rect), ...);

// or even kind of automatic pointer
Rectangle* rect = Rectangle_ctor(&(Rectangle){0}, ...);

The cast would only be needed for implementation of virtual methods like Shape_getArea().

struct Shape {
  ...
  double (*getArea)(struct Shape*);
};

double Shape_getArea(Shape *me) {
  return me->getArea(me);
}
double Rectangle_getArea(Shape *base) {
  Rectangle *me = (Rectangle*)base; // the only cast
  return (double)me->width * me->height;
}

Rectangle* Rectangle_ctor(Rectangle *me, int x, int y, unsigned int width, unsigned int height) {
  ...
  me->super.getArea = Rectangle_getArea;
  ...
}

// usage:
Rectangle rect;
Rectangle_ctor(&rect, 0, 0, 3, 2);

Shape *shape = &rect.super;

Shape_getArea(shape); // should return 6
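The fragments above leave some parts elided. As a rough, self-contained sketch of how they might fit together (assuming all struct definitions are visible in one translation unit; `demo_area` is a hypothetical helper added only for illustration, and the NULL default for `getArea` is an assumption of this sketch):

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape Shape;

struct Shape {
    int x;
    int y;
    double (*getArea)(Shape *me);   /* "virtual" method slot */
};

typedef struct Rectangle {
    Shape super;                    /* base object by value, first member */
    unsigned int width;
    unsigned int height;
} Rectangle;

/* The caller supplies the storage; the constructor only initializes it. */
Shape *Shape_ctor(Shape *me, int x, int y)
{
    me->x = x;
    me->y = y;
    me->getArea = NULL;             /* assumption: a plain Shape has no area */
    return me;
}

int Shape_getX(Shape *me) { return me->x; }
int Shape_getY(Shape *me) { return me->y; }

/* Dispatch through the function pointer; 0.0 if none was installed. */
double Shape_getArea(Shape *me)
{
    return me->getArea ? me->getArea(me) : 0.0;
}

static double Rectangle_getArea(Shape *base)
{
    /* Valid because base points at the first member of a Rectangle. */
    Rectangle *me = (Rectangle *)base;
    return (double)me->width * me->height;
}

Rectangle *Rectangle_ctor(Rectangle *me, int x, int y,
                          unsigned int width, unsigned int height)
{
    Shape_ctor(&me->super, x, y);           /* call base constructor */
    me->super.getArea = Rectangle_getArea;  /* override the virtual method */
    me->width = width;
    me->height = height;
    return me;
}

/* Hypothetical helper: builds a 3x2 rectangle and asks for its area via Shape. */
double demo_area(void)
{
    Rectangle rect;
    Rectangle_ctor(&rect, 0, 0, 3, 2);
    Shape *shape = &rect.super;
    return Shape_getArea(shape);
}
```

Note that the only cast is inside `Rectangle_getArea`; everywhere else the base object is reached with `&rect.super`.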

EDIT

In order to hide the internals of Shape, place a pointer to its private data in the structure, and initialize this pointer with the relevant data in Shape_ctor.

struct Shape {
  void *private_data;
  // non private fields
};
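One possible way to flesh this out is sketched below. `ShapePrivate` and `Shape_dtor` are hypothetical names that do not appear in the question, and returning NULL on allocation failure is an assumption of this sketch rather than part of the original design.

```c
#include <assert.h>
#include <stdlib.h>

/* Public part: clients see the struct, but only an opaque pointer inside it. */
typedef struct Shape {
    void *private_data;
} Shape;

/* Private part: would live in shape.c only (hypothetical name). */
typedef struct ShapePrivate {
    int x;
    int y;
} ShapePrivate;

/* Initializes caller-provided storage; returns NULL if allocation fails. */
Shape *Shape_ctor(Shape *me, int x, int y)
{
    ShapePrivate *p = malloc(sizeof *p);
    if (p == NULL) {
        me->private_data = NULL;
        return NULL;
    }
    p->x = x;
    p->y = y;
    me->private_data = p;
    return me;
}

/* Hypothetical destructor: releases the private data owned by the Shape. */
void Shape_dtor(Shape *me)
{
    free(me->private_data);
    me->private_data = NULL;
}

int Shape_getX(Shape *me) { return ((ShapePrivate *)me->private_data)->x; }
int Shape_getY(Shape *me) { return ((ShapePrivate *)me->private_data)->y; }

/* The derived type can embed Shape by value without seeing x and y. */
typedef struct Rectangle {
    Shape super;
    unsigned int width;
    unsigned int height;
} Rectangle;

Rectangle *Rectangle_ctor(Rectangle *me, int x, int y,
                          unsigned int width, unsigned int height)
{
    if (Shape_ctor(&me->super, x, y) == NULL)
        return NULL;
    me->width = width;
    me->height = height;
    return me;
}
```

The cost of this layout is one extra allocation and one extra indirection per Shape, which is the trade-off discussed in the comments.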
tstanisl
  • I don't want clients to know the details of the structs, so I didn't put them in the header file. That's why malloc is done by the supplier, not the client. Why do you suggest delegating to the caller? Do I get more benefit from that than encapsulation gives? – Mr.nerd3345678 Oct 27 '21 at 10:16
  • @Mr.nerd3345678, 1) If you want to hide the details of `Shape`, please declare it as `struct Shape { void *private; };`. 2) It's generally better to let the caller handle the memory. It gives more opportunities for optimization. Note that C++ has a feature known as "placement new" that lets the caller of the constructor provide a pointer to the storage for the created object. – tstanisl Oct 27 '21 at 10:26
  • @Mr.nerd3345678 Don't listen to that part of the answer, your code with malloc is correct, we can unfortunately not delegate allocation to the caller in case of opaque types. – Lundin Oct 27 '21 at 10:33
  • @Lundin, yes, but opaque types tend to result in a lot of indirection and bad performance. Moreover, the term **caller** refers to the library itself as well – tstanisl Oct 27 '21 at 10:34
  • @Lundin I take encapsulation as the first priority, so I really do not want to put struct details in the header file. Could you please tell me how to achieve single inheritance in this very case without putting struct details in the header file? – Mr.nerd3345678 Oct 27 '21 at 10:36
  • Program design is always a balance between readability/maintainability and performance. In some 99% of all applications it doesn't matter. I even use opaque types when writing super real-time critical stuff like CAN bus hardware drivers. – Lundin Oct 27 '21 at 10:37
  • @Mr.nerd3345678, to be able to call virtual methods you have to place a pointer to `Rectangle` into `Shape` (in form of `void*`). – tstanisl Oct 27 '21 at 10:38
  • @Mr.nerd3345678 You need to change the first member to `Shape super;` as mentioned in this answer. There's a rule saying that a pointer to a struct can always be converted to a pointer to the type of its first member and vice versa. That rule gives a free pass through all aliasing rules. – Lundin Oct 27 '21 at 10:39
  • @Mr.nerd3345678 Oh wait, the actual problem here is that you hide the internals of the `Shape` class from the inherited class. That might be the true problem. Should they really be "private" rather than "protected", to use C++ terms? – Lundin Oct 27 '21 at 10:42
  • @Lundin, there is no other way to differentiate between "public" and "protected" in C than a proper documentation. Private fields can be hidden with the method described in the answer – tstanisl Oct 27 '21 at 10:47
  • Sure there is. If you keep the base class opaque, the members are truly private. But you can expose some of them in a different header only available to the inherited class. A bit clunky, sure, but perfectly possible. – Lundin Oct 27 '21 at 10:55
  • @Lundin, I've said "protected vs public", not "protected vs private". To distinguish which headers should be used for given purpose you need to **document** it properly. – tstanisl Oct 27 '21 at 10:57
  • Well yes, obviously... and that goes for all header files, not just OOP-related ones. Although in this case such a "protected" header would only contain a struct definition which is useless to the caller, even if they would include that header by mistake. Because they still can't access that part of the opaque base struct. – Lundin Oct 27 '21 at 11:02
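The "protected" header idea from the comments above could be sketched roughly as follows, shown here as a single translation unit with banner comments marking where each file would begin. The file name `shape_protected.h` and the in-place initializer `Shape_init` are hypothetical; the public functions keep the signatures from the question.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- shape.h (public): clients only ever see an incomplete type ---- */
typedef struct Shape Shape;
Shape *Shape_ctor(int x, int y);
int Shape_getX(Shape *me);
int Shape_getY(Shape *me);

/* ---- shape_protected.h (hypothetical): included only by shape.c and subclasses ---- */
struct Shape {
    int x;
    int y;
};
void Shape_init(Shape *me, int x, int y);   /* hypothetical in-place initializer */

/* ---- shape.c ---- */
void Shape_init(Shape *me, int x, int y)
{
    me->x = x;
    me->y = y;
}

Shape *Shape_ctor(int x, int y)
{
    Shape *me = malloc(sizeof *me);
    if (me != NULL)
        Shape_init(me, x, y);
    return me;
}

int Shape_getX(Shape *me) { return me->x; }
int Shape_getY(Shape *me) { return me->y; }

/* ---- rectangle.h (public): still opaque to clients ---- */
typedef struct Rectangle Rectangle;
Rectangle *Rectangle_ctor(int x, int y, unsigned int width, unsigned int height);
int Rectangle_getWidth(Rectangle *me);
int Rectangle_getHeight(Rectangle *me);

/* ---- rectangle.c: includes shape_protected.h, so Shape is complete here ---- */
struct Rectangle {
    Shape super;            /* by value and first, so (Shape*)rect is valid */
    unsigned int width;
    unsigned int height;
};

Rectangle *Rectangle_ctor(int x, int y, unsigned int width, unsigned int height)
{
    Rectangle *me = malloc(sizeof *me);
    if (me == NULL)
        return NULL;
    Shape_init(&me->super, x, y);   /* initialize the embedded base in place */
    me->width = width;
    me->height = height;
    return me;
}

int Rectangle_getWidth(Rectangle *me)  { return (int)me->width; }
int Rectangle_getHeight(Rectangle *me) { return (int)me->height; }
```

With this layout, client code that includes only shape.h and rectangle.h can call `Shape_getX((Shape*)r1)` on a Rectangle, and neither struct definition appears in a public header.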
1

The initial member of Rectangle is a Shape*, not a Shape. And the initial member of Shape is an int, not a Shape*. The commonality you assume just isn't there.

If you look at how C++ implementations implement inheritance, in case of single inheritance they will place the base subobject at offset 0 inside the full object. Pointers to base class subobjects are used for virtual inheritance, but that rapidly gets complex.
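To make the difference concrete, here is a small sketch contrasting the two layouts. The type names `RectByValue` and `RectByPointer` and both helper functions are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape {
    int x;
    int y;
} Shape;

/* Base object embedded by value: it sits at offset 0 of the full object. */
typedef struct RectByValue {
    Shape super;
    unsigned int width;
    unsigned int height;
} RectByValue;

/* Base object reached through a pointer: offset 0 holds a Shape*, not a Shape. */
typedef struct RectByPointer {
    Shape *super;
    unsigned int width;
    unsigned int height;
} RectByPointer;

/* Returns 1 if converting RectByValue* to Shape* yields the embedded base. */
int value_cast_reaches_base(void)
{
    RectByValue r = { {3, 4}, 10, 15 };
    Shape *s = (Shape *)&r;
    return s == &r.super && s->x == 3 && s->y == 4;
}

/* Returns 1 if converting RectByPointer* to Shape** yields the stored pointer. */
int pointer_cast_reaches_pointer(void)
{
    Shape base = {3, 4};
    RectByPointer r = { &base, 10, 15 };
    Shape **pp = (Shape **)&r;      /* pointer to the first member, a Shape* */
    return *pp == r.super && (*pp)->x == 3 && (*pp)->y == 4;
}
```

In the question's code, `(Shape*)me` treats the bytes of the stored pointer as if they were `x` and `y`, which is why the printed values are wrong.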

MSalters
  • Can you point me toward a way to have the same initial member in both `Rectangle` and `Shape` without putting the struct definitions into a header file? – Mr.nerd3345678 Oct 27 '21 at 10:19
  • A solution specific to single inheritance is okay. – Mr.nerd3345678 Oct 27 '21 at 10:31
  • @Mr.nerd3345678: No, and it's no coincidence that C++ requires a complete definition for base classes. There's a lot of background in "The Design and Evolution of C++", in which Stroustrup describes how he added OO to C in order to create C++. – MSalters Oct 27 '21 at 11:55
-1

When C99 was written, there were conflicts between committee members who refused to have the Standard characterize as illegitimate some useful constructs that exploit the Common Initial Sequence guarantees, and those who refused to have it characterize as illegitimate optimizations that would break code that relied upon such constructs, but would be useful for code that didn't.

Such conflicts were resolved by writing ambiguous rules which both sides could interpret as saying what they wanted them to say. Since there was never any consensus about what the rules should mean, questions about what the rules really mean are inherently unanswerable.

Even going back to C89, the only way the rules really make sense is if one interprets the phrase "by an lvalue of a particular type" as applying to dereferenced pointers which are freshly visibly derived from something of appropriate type. Otherwise, given something like:

struct s { int x[2]; } foo;

an access to foo.x[1] would violate the type aliasing rules, since it is defined as equivalent to *(foo.x+1). The inner expression (foo.x+1) is a pointer of type int*, which has no relation to type struct s, and int is not among the types that may be used to access an object of type struct s. Any decent compiler should obviously recognize that the pointer is freshly visibly derived from an object of type struct s, however, and treat the access as though it were performed via that type. The question of when a pointer is "freshly-visibly derived" is a quality of implementation issue outside the Standard's jurisdiction, with the expectation that compilers would make a reasonable effort to notice such things whether or not the Standard mandated that they do so. Most conflicts revolve around the fact that some compiler maintainers that aren't interested in selling compilers are deliberately blind to any forms of derivation beyond those needed to avoid making structures completely useless.
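The equivalence this argument relies on is the language's definition of subscripting, where `E1[E2]` means `*((E1)+(E2))`. A tiny sketch of the two spellings, with `read_second` as a hypothetical helper name:

```c
#include <assert.h>

struct s { int x[2]; };

/* Writes through the subscript form and reads back through the
   pointer-arithmetic form; both designate the same int object. */
int read_second(struct s *p)
{
    p->x[1] = 42;           /* subscript form */
    return *(p->x + 1);     /* equivalent pointer form, an lvalue of type int */
}
```

In both forms the final access is made through an lvalue of type `int` derived from the struct, which is the situation the paragraph above describes.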

So far as I can tell, all compilers not based upon clang and gcc will support common constructs exploiting the Common Initial Sequence rules, even when type-based aliasing analysis is enabled. Further, both clang and gcc will sometimes perform erroneous "optimizations" even on some strictly conforming programs when their type-based aliasing analysis is enabled. As such, rather than trying to jump through hoops to be compatible with the broken optimization modes of clang and gcc, I'd recommend simply documenting that the code requires that implementations, as a form of "conforming language extension", make a reasonable effort to process constructs involving Common Initial Sequence guarantees usefully.

supercat