Imagine that you have the following structs:

struct nix_codec {
    nix_uint8 state;
    nix_uint8 mode;
    nix_uint8 flags;
    nix_size offset;
    nix_uint32 codepage;
    nix_utf8 const *const *aliases;
    void (*delete)(
        struct nix_codec *codec,
        struct nix_error *error
    );
    struct nix_codec* (*clone)(
        struct nix_codec const *codec,
        nix_int8 mode,
        struct nix_error *error
    );
    nix_size (*decode)(
        struct nix_codec *codec,
        nix_byte const *bdata,
        nix_size bsize,
        nix_rune *udata,
        nix_size usize,
        struct nix_error *error
    );
    nix_size (*encode)(
        struct nix_codec *codec,
        nix_rune const *udata,
        nix_size usize,
        nix_byte *bdata,
        nix_size bsize,
        struct nix_error *error
    );
};

typedef struct {
    nix_uint8 const state;
    nix_uint8 const mode;
    nix_uint8 const flags;
    nix_size const offset;
    nix_uint32 const codepage;
    nix_utf8 const *const *const aliases;
} nix_codec;

There are also several functions that create nix_codec* instances; for the UTF-8 codec, for example, it looks like this:

static nix_size self_decode
(
    struct nix_codec *codec,
    nix_byte const *bdata,
    nix_size bsize,
    nix_rune *udata,
    nix_size usize,
    struct nix_error *error
)
{ /* UTF-8 decode function, too long to post here */}

static nix_utf8 const *const aliases[] = {
    "UTF-8",
    "UTF8",
    "CP65001",
    NULL,
};

nix_codec *nix_codec_utf8
(
    nix_int8 mode,
    struct nix_error *error
)
{
    struct nix_codec *codec = NULL;

    if ((mode != NIX_CODEC_STRICT) && (mode != NIX_CODEC_ESCAPE)
    &&  (mode != NIX_CODEC_REPLACE) && (mode != NIX_CODEC_IGNORE)) {
        return NULL;
    }
    codec = calloc(1, sizeof(struct nix_codec));
    if (codec == NULL) {
        return NULL;
    }
    codec->mode = mode;
    codec->codepage = 65001;
    codec->aliases = aliases;
    codec->decode = &self_decode;
    codec->encode = &self_encode;
    codec->flags = (NIX_CODEC_COMPATIBLE | NIX_CODEC_MULTIBYTE | NIX_CODEC_ABSOLUTE);
    return (nix_codec*)codec;
}

The function for legacy single-byte encodings is based on these structures:

struct nix_sbmap {
    nix_uint8 byte;
    nix_rune rune;
};

struct nix_sbcodec {
    struct nix_codec base;
    struct nix_sbmap const *entries;
    nix_size count;
};

Note that struct nix_sbcodec and struct nix_sbmap are declared in the source files, not in the headers, so there is no need to use the variant pattern. The corresponding creator function, e.g. nix_codec_koi8r(), allocates a struct nix_sbcodec, sets its base, entries and count members, casts it to nix_codec* and returns it. Every actual encode() and decode() call goes through a public function like this one:

nix_size nix_codec_decode
(
    nix_codec *codec,
    nix_byte const *bdata,
    nix_size bsize,
    nix_rune *udata,
    nix_size usize,
    struct nix_error *error
)
{
    /* Recover the private layout, then dispatch through its function pointer. */
    struct nix_codec *self = (struct nix_codec*)codec;

    return self->decode(self, bdata, bsize, udata, usize, error);
}
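The single-byte creator described above can be sketched roughly as follows. This is only an illustration under assumptions: every `demo_` name is a hypothetical stand-in for the corresponding `nix_` type, the error parameter is omitted, and the mapping table holds just two KOI8-R entries. It shows only the base-as-first-member part of the design (the creator returns a pointer to the embedded base), not the cast to the const-qualified public typedef that the question is about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the library's types, so the sketch compiles alone. */
typedef unsigned char demo_uint8;
typedef unsigned long demo_rune;

struct demo_codec {                 /* plays the role of struct nix_codec */
    demo_uint8 mode;
    unsigned long codepage;
    size_t (*decode)(struct demo_codec *codec,
                     demo_uint8 const *bdata, size_t bsize,
                     demo_rune *udata, size_t usize);
};

struct demo_sbmap { demo_uint8 byte; demo_rune rune; };

struct demo_sbcodec {               /* plays the role of struct nix_sbcodec */
    struct demo_codec base;         /* must stay the first member */
    struct demo_sbmap const *entries;
    size_t count;
};

/* Two sample KOI8-R mappings; a real table would cover every high byte. */
static struct demo_sbmap const demo_koi8r_map[] = {
    { 0xC1, 0x0430 },               /* CYRILLIC SMALL LETTER A */
    { 0xC2, 0x0431 },               /* CYRILLIC SMALL LETTER BE */
};

static size_t demo_sb_decode(struct demo_codec *codec,
                             demo_uint8 const *bdata, size_t bsize,
                             demo_rune *udata, size_t usize)
{
    /* Valid because codec points at the first member of a demo_sbcodec. */
    struct demo_sbcodec *self = (struct demo_sbcodec *)codec;
    size_t i, j, done = 0;

    for (i = 0; i < bsize && done < usize; ++i) {
        demo_rune rune = bdata[i];          /* ASCII maps to itself */
        if (bdata[i] >= 0x80) {
            for (j = 0; j < self->count; ++j) {
                if (self->entries[j].byte == bdata[i]) break;
            }
            if (j == self->count) break;    /* unmapped byte: stop decoding */
            rune = self->entries[j].rune;
        }
        udata[done++] = rune;
    }
    return done;                            /* number of runes written */
}

struct demo_codec *demo_codec_koi8r(void)
{
    struct demo_sbcodec *self = calloc(1, sizeof *self);

    if (self == NULL) return NULL;
    self->base.codepage = 20866;            /* Windows code page number for KOI8-R */
    self->base.decode = &demo_sb_decode;
    self->entries = demo_koi8r_map;
    self->count = sizeof demo_koi8r_map / sizeof demo_koi8r_map[0];
    return &self->base;                     /* same address as self */
}
```

A caller would obtain the codec with `demo_codec_koi8r()`, call `codec->decode(codec, ...)` and release it with `free(codec)`, since the base pointer and the allocated pointer are the same address.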

Note that the state, mode, flags and offset members may be of interest to anyone using any codec. Most of them are set in the codec creator functions; offset is updated after each call to encode() or decode() and holds the number of bytes or Unicode characters that were successfully processed before the function returned. Each codec has its own encode() and decode() functions, as you can see.

Now the question: is this trick correct and guaranteed to work by the C Standard?

Thanks in advance!

ghostmansd
  • This is C not C++ right? Why are you doing this? This is frankly an accident waiting to happen. What do you expect the compiler to do if you assign one `my_object` to another? (Hint: `memcpy(b, a, sizeof(my_object))`) – Dirk Koopman Dec 31 '14 at 18:58
  • That's the reason why I put "class" in quotes: there is no such thing in C, but it can be simulated. – ghostmansd Dec 31 '14 at 19:00
  • I don't think it's an accident: at least `FILE*` seems to follow the same idea. – ghostmansd Dec 31 '14 at 19:02
  • `FILE *` does this to provide an opaque type. It is to force users to use accessor functions everywhere (e.g. `fileno()`). Historically programmers relied on the `FILE` structure to be a certain way - accessing it directly - and, when things in it changed, those programs broke. You are attempting a hybrid. It will not work reliably as is. – Dirk Koopman Dec 31 '14 at 19:19
  • Try this: `printf("%zu %zu\n", sizeof(my_object), sizeof(struct my_object));` in a little test program. – Dirk Koopman Dec 31 '14 at 19:43
  • @DirkKoopman: the size will be different, since `struct my_object` contains a pointer to function; however, IIRC C doesn't guarantee that `my_object` and `struct my_object` would be placed in memory the same way if they would have been the same except that the latter had `const` members. – ghostmansd Dec 31 '14 at 19:50
  • `internal_func` same as `func`? – chux - Reinstate Monica Dec 31 '14 at 20:57
  • No this will not work. `((struct my_object*)obj)->func(i)` or `((struct my_object*)obj)->internal_func(i)` are not even valid function calls. – chux - Reinstate Monica Dec 31 '14 at 21:00
  • @chux: there were some typos in my code (I typed it on my phone). I've fixed them. `((struct my_object*)obj)->func((struct my_object*)obj, i)`, which is wrapped in `my_object_func()`, is what I mean. – ghostmansd Jan 01 '15 at 09:06
  • I see the goal more clearly now. (BTW, using `my_object` in 2 name spaces confused your goal for me.) But you are doing something other than hiding a function (hiding may be OK): the code uses `const` on a field in one declaration and not in the other. See [recent post](http://stackoverflow.com/a/27724745/2410359). This may be a no-no. – chux - Reinstate Monica Jan 01 '15 at 09:21
  • @chux: I've seen this post before; however, this situation is not the same, since I'm neither trying to modify a variable in the same scope nor really trying to modify read-only memory (the memory in the `x` and `y` fields is not really read-only; it only appears so after casting these members to const). – ghostmansd Jan 01 '15 at 10:38
  • Hi guys, I've updated the post providing the more actual structure and tried to explain what I'm trying to achieve. – ghostmansd Jan 01 '15 at 11:04

2 Answers

A reliable way of dealing with variant types in C, such as what I think you are attempting, is to use a union. For example:

typedef struct {
    uint8_t x;
    uint16_t y;
} obj_a;

typedef struct {
    char *p;
    char buf[42];
} obj_b;

typedef struct obj_base_s {
    int (*internal_func)(struct obj_base_s *, int);
    union {
         obj_a a;
         obj_b b;
    } u;
} obj_base;

All the creator / destructor functions will return or use an obj_base. Functions that use members in the union can either access those members directly or do something like this:

void handle_obja(obj_base *bp)
{
     obj_a *oap = &bp->u.a;
     oap->x = 23;
     oap->y = 19;
     ...
}

C is not C++. If you want nice classes, inheritance, overloading and all that stuff, then use C++; that is the main reason why C++ was invented. C does not do "classes". C is a much lower-level language.

Dirk Koopman
  • `Variant` pattern doesn't apply here. I'm talking about inheritance, but there may be a lot of structs that inherit `my_object`, and I don't see any benefits of putting every `variant` in the `my_object` declaration. – ghostmansd Jan 01 '15 at 10:42
  • See the updated post. Hope that it will clarify things. – ghostmansd Jan 01 '15 at 11:05
  • C does not have any concept of inheritance. Ultimately, to mimic C++ inheritance, you either have one big structure which you, the programmer, divide up into chunks that are relevant to different sets of functions, or you use different structures and (one way or another) pass void pointers to those structures. But mimicking C++ is what you are doing. If you can find a working copy of `cfront` you might be interested in its output. This was how C++ was originally preprocessed into C before compilation. IIRC it just created one big structure per set of classes. – Dirk Koopman Jan 01 '15 at 16:35
  • Using C++ for my project is not really a good idea, since I'm implementing a C library (not C/C++). Trying to imitate C++ classes is not that hard; there is a good C book (OOC) which shows how one can achieve such a goal. I'm not the biggest fan of this book; frankly, I started my project long before I found it. I'm not going to use all the features that C++ has; the features I really need are not so hard to implement in ANSI C. Don't get me wrong, I really like C++, it is an amazing language, but it is overkill for my project. Anyway, thank you for your help! – ghostmansd Jan 01 '15 at 17:17
  • The most common way to use inheritance in C is to put the base structure as the first member of the inheritor struct; C standard IIRC says that pointer to the struct points to the same memory as the first member of this structure, so it seems it is legal to do such casts. I want to know if it is still legal if members of the structure differ by constness. – ghostmansd Jan 01 '15 at 17:22
  • It is legal, but pointless, because if there are any non-const fields then the struct will be put in normal variable storage (if declared). Whatever you `malloc` will always end up on the heap, but you won't then be able to assign values to any `const` items declared in the `malloc`ed structure. – Dirk Koopman Jan 01 '15 at 17:36

Note that state, mode, flags and offset members may be interesting to anyone using any codec (…

Now the question: is this trick correct and guaranteed to work by the C Standard?

However useful this trick may be, and however often it happens to work, accessing the members of a struct nix_codec object through a pointer to the differently defined nix_codec type is unsafe unless you assert that corresponding members of both structure types have the same offsets. That is not guaranteed for independently defined structure types. Such a guarantee exists only within a union of the structures; see the C standard, section Language / Expressions / Postfix operators / Structure and union members, paragraph 6:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
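Both points can be sketched in C as follows. This is a minimal illustration under assumptions, not the library's code: all `demo_` names are hypothetical and the structures are reduced to three members. The first half asserts matching offsets at compile time with C11 `_Static_assert`; the second half shows the union guarantee quoted above. Note that the union half uses identically typed leading members, because a const-qualified member type and its unqualified counterpart are not compatible types, so the question's two declarations would not share a common initial sequence as written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced versions of the two declarations from the question. */
struct demo_impl {                  /* private layout, like struct nix_codec */
    unsigned char state;
    unsigned char mode;
    size_t offset;
    void (*hook)(void);             /* extra private member */
};

typedef struct {                    /* public view, like the nix_codec typedef */
    unsigned char const state;
    unsigned char const mode;
    size_t const offset;
} demo_view;

/* Compile-time layout checks (C11). They only catch mismatched offsets;
   they do not by themselves make access through demo_view* defined. */
_Static_assert(offsetof(struct demo_impl, state) == offsetof(demo_view, state),
               "state offset differs");
_Static_assert(offsetof(struct demo_impl, mode) == offsetof(demo_view, mode),
               "mode offset differs");
_Static_assert(offsetof(struct demo_impl, offset) == offsetof(demo_view, offset),
               "offset offset differs");

/* Union variant: both structures start with identically typed members,
   so they share a common initial sequence. */
struct demo_head {
    unsigned char state;
    unsigned char mode;
    size_t offset;
};

struct demo_full {
    unsigned char state;
    unsigned char mode;
    size_t offset;
    void (*hook)(void);
};

union demo_any {
    struct demo_head head;
    struct demo_full full;
};

/* Inspects the common initial part through the other union member,
   which paragraph 6 permits while the complete union type is visible. */
size_t demo_offset_of(union demo_any const *any)
{
    return any->head.offset;
}
```

With this layout, a creator would fill in `any.full` and hand out the union, and readers would go through `any.head` (or an accessor such as the hypothetical `demo_offset_of()`), staying within what the quoted paragraph guarantees.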

Armali