
I am developing a program using a third-party UI library with functions in the form Vbox(void *first, ...). These serve as layout functions and take an arbitrary number of parameters. The end of the list is defined by the first NULL detected. This means that I need to remember to end my list with a NULL, something I often fail to do.

So I created a few auxiliary macros which expand to the same call with a NULL appended to my list.

These are of the form:

#define UtlVbox(first, ...) Vbox(first, ##__VA_ARGS__, NULL)

The ## before __VA_ARGS__ serves to get rid of the preceding comma in case __VA_ARGS__ is empty.

I need the first parameter for the case where the box should actually be initialized empty (Vbox(NULL)). The ## hack only removes a comma that comes before the ##, not one that comes after __VA_ARGS__, so in that case the user must pass an explicit NULL. This results in the expansion Vbox(NULL, NULL), which is a bit redundant but fine.
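To make the mechanism concrete, here is a minimal sketch built around a stand-in for the library call. The names count_children, UtlCount and check_count are hypothetical and not part of the real library; the function only counts arguments up to the first NULL, which is the contract described above. The `, ##__VA_ARGS__` comma elision is a GNU extension, so this assumes GCC or a compatible compiler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for Vbox(): counts children up to the first NULL. */
int count_children(void *first, ...)
{
    int n = 0;
    va_list ap;

    va_start(ap, first);
    for (void *p = first; p != NULL; p = va_arg(ap, void *))
        n++;
    va_end(ap);
    return n;
}

/* Appends the terminator; ## drops the comma when nothing follows `first`
   (GNU extension). The cast keeps the terminator pointer-typed even on
   platforms where NULL is a plain 0. */
#define UtlCount(first, ...) count_children(first, ##__VA_ARGS__, (void *)NULL)

/* Usage: the caller only writes NULL for an empty box. */
void check_count(void)
{
    char a = 0, b = 0, c = 0;

    assert(UtlCount(&a) == 1);          /* count_children(&a, NULL)         */
    assert(UtlCount(&a, &b, &c) == 3);  /* count_children(&a, &b, &c, NULL) */
    assert(UtlCount(NULL) == 0);        /* count_children(NULL, NULL)       */
}
```

The children are char objects on purpose: reading a char * through va_arg(ap, void *) is explicitly permitted, which keeps the stand-in well defined.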

This works well overall, but I've bumped into an odd situation I can't quite understand.

Take the following file, for example:

// expand.c
void*  Vbox(void* first, ...);
void*  Hbox(void* first, ...);

#define UtlVbox(first, ...) Vbox(first, ##__VA_ARGS__, NULL)
#define UtlHbox(first, ...) Hbox(first, ##__VA_ARGS__, NULL)

static void* Test()
{
    return UtlHbox(
        Foo,
        UtlVbox(
            UtlHbox(Bar)));
}

If I run gcc -E expand.c, I get the following output:

# 1 "expand.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "expand.c"
void* Vbox(void* first, ...);
void* Hbox(void* first, ...);

static void* Test()
{
    return Hbox(Foo, Vbox(UtlHbox(Bar), NULL), NULL);
}

Everything is expanded precisely as expected, except for the innermost UtlHbox, which for some reason hasn't been expanded and therefore throws an error on compilation. (Also, the NULLs weren't expanded in this example since there aren't any relevant #includes.) In VC12 (Visual Studio 2013), things compile just fine.

What's happening here? Is this a conflict between the different meanings of the ## operation? Is there any way to solve this?


I'm using GCC 4.6.3, but I've tried compiling this on GodBolt with GCC 7.1 and get the same results.

After some research, I'm starting to think I've bumped into a known problem in GCC.

It seems like GCC is tripping up on itself. If I create a third macro

#define UtlZbox(first, ...) Zbox(first , ##__VA_ARGS__, NULL)

and replace the inner UtlHbox in the example above with this new macro, the output is correct:

static void* Test()
{
    return Hbox(Foo, Vbox(Zbox(Bar, NULL), NULL), NULL);
}

It seems like GCC trips over itself when a variadic macro is repeated within another instance of itself.

I've done a few other tests (modifying the macros to ease visualization):

#define UtlVbox(first, ...) V(first,##__VA_ARGS__)
#define UtlHbox(first, ...) H(first,##__VA_ARGS__)

int main()
{
    // HHH
    UtlHbox(UtlHbox(UtlHbox(1)));
    UtlHbox(UtlHbox(UtlHbox(2, 1)));
    UtlHbox(UtlHbox(2, UtlHbox(1)));
    UtlHbox(2, UtlHbox(UtlHbox(1)));
    UtlHbox(3, UtlHbox(2, UtlHbox(1)));
    // HHV
    UtlHbox(UtlHbox(UtlVbox(1)));
    UtlHbox(UtlHbox(UtlVbox(2, 1)));
    UtlHbox(UtlHbox(2, UtlVbox(1)));
    UtlHbox(2, UtlHbox(UtlVbox(1)));
    UtlHbox(3, UtlHbox(2, UtlVbox(1)));
    // HVH
    UtlHbox(UtlVbox(UtlHbox(1)));
    UtlHbox(UtlVbox(UtlHbox(2, 1)));
    UtlHbox(UtlVbox(2, UtlHbox(1)));
    UtlHbox(2, UtlVbox(UtlHbox(1)));
    UtlHbox(3, UtlVbox(2, UtlHbox(1)));
    // VHH
    UtlVbox(UtlHbox(UtlHbox(1)));
    UtlVbox(UtlHbox(UtlHbox(2, 1)));
    UtlVbox(UtlHbox(2, UtlHbox(1)));
    UtlVbox(2, UtlHbox(UtlHbox(1)));
    UtlVbox(3, UtlHbox(2, UtlHbox(1)));

    return 0;
}

Here's Godbolt's output, compiling with GCC 7.1 (doing it on my machine with 4.6.3 gives identical output):

[Image: screenshot of the Godbolt output]

Successful conversions are marked with green arrows, failures are red. The problem seems to be when a macro X with variadic arguments is placed anywhere in the variadic arguments of another instance of X (even if as an argument (variadic or not) of some other macro Y).

The last block of tests (marked as // Failures...) is a repeat of all the previous cases which failed, only replacing whichever macro failed to expand with a UtlZbox. Doing this caused proper expansion in every single case except the one where a UtlZbox is placed in the variadic argument of another UtlZbox.

Wasabi
  • Looks like a not so good idea. Why don't you just pass an array with the list to your functions? That would save a lot of run-time overhead for building internal structures, passing the parameter list, etc. – too honest for this site May 25 '17 at 15:15
  • @Olaf: Do you mean the fact that the original functions are of the form `Vbox(first, ...)`? Unfortunately, the library isn't mine and its use is non-negotiable. – Wasabi May 25 '17 at 15:18
  • Yes, that's what I mean. And this interface does not shed a good light on the library. – too honest for this site May 25 '17 at 15:23
  • The solution is not to wrap the library, but to ensure you just always call it right... – Alnitak May 25 '17 at 15:25
  • @Alnitak: That's one (and perhaps the only) solution, sure, but do you know why this is happening? – Wasabi May 25 '17 at 15:27
  • no, I can see that it only happens if you put the `##` marker in, but without it you have to put an extra arg in anyway which kinda defeats the point of it all... – Alnitak May 25 '17 at 15:51
  • Mostly, I think it's because your inner call to `UtlHBox(Bar)` isn't legal because the variadic template requires both the `first` parameter _and at least one following parameter_ – Alnitak May 25 '17 at 15:52
  • @Alnitak: But as I understand it, the template shouldn't require another parameter. Isn't that the whole point of the `##`? When prefixing an empty `__VA_ARGS__`, it removes the previous comma, right (see [here](https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html))? In which case `UtlHbox(oneParam)` should simply become `Hbox(oneParam, NULL)` (instead of `(oneParam, , NULL)`). In fact, that's what's happening in the UtlVbox, which has only a single parameter (the UtlHbox), but handles it just fine, stripping away the unnecessary comma and appending a NULL at the end. – Wasabi May 25 '17 at 17:42
  • "The ## before the VA_ARGS serve to get rid of the previous comma " - note, this is a GNU extension – M.M May 25 '17 at 22:14
  • @M.M: Thanks, I am aware. However, it also happens to work in MSVC (even though it isn't necessary; MSVC removes the comma automatically). – Wasabi May 25 '17 at 22:19
  • Perhaps the problem is due to "if any nested replacements encounter the name of the macro being replaced, it is not replaced" – M.M May 25 '17 at 22:28
  • @M.M: That sounds reasonable. Wasn't aware of that. Feel free to write that up as an answer. – Wasabi May 25 '17 at 22:44
  • I don't fully understand it TBH , the section on `##` expansion says that the operand is expanded differently but I get lost quickly trying to mentally model the expansion – M.M May 25 '17 at 22:48

1 Answer

This is not a bug; this is "blue paint".

"In VC12 (Visual Studio 2013), things compile just fine."

Just to mention... Visual Studio's preprocessor is non-standard.

"I've bumped into an odd situation I can't quite understand."

...and here I can help. First, let's go over the rules for how the preprocessor works in general.

Outline

Expansion of function-like macros occurs in a number of steps, which we can call

  1. argument identification
  2. argument substitution
  3. stringification and pasting
  4. rescanning and further replacement

During argument identification, you simply match formal arguments to invoked arguments. For varying argument macros, the standard requires the varying argument itself to have one or more invoked arguments.

...as a GNU extension (which you're using), we can map the varying part to no argument at all. I'm going to call this null. Note that this is distinct from empty (and placeholder tokens); in particular, if we #define FOO(x,...), then the invocation FOO(z) sets __VA_ARGS__ to null; by contrast, FOO(z,) will set it to empty.

During argument substitution, you apply the replacement list; within the replacement list, you replace formal arguments with the invoked arguments. Before doing so, any invoked argument that is not being stringified and is not an operand of a paste operator (neither the left nor the right side of a paste) is fully expanded.

Stringification and pasting apply next, in any order.
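The operand rule above is the one that matters for the rest of this answer, so here is a small sketch of it in isolation. All names (VALUE, STR, XSTR, CAT, XCAT, ONE, TWO, ONETWO, check_operand_rule) are made up for illustration. It only relies on standard C behavior: an argument that is an operand of # or ## is substituted as written, while any other argument is expanded first.

```c
#include <assert.h>
#include <string.h>

#define VALUE 42
#define ONE 1
#define TWO 2

/* x is the operand of #, so the argument is NOT expanded before use. */
#define STR(x)  #x
/* Here x is an ordinary operand, so it is fully expanded before STR sees it. */
#define XSTR(x) STR(x)

/* Same rule for ##: paste operands are substituted unexpanded. */
#define CAT(a, b)  a##b
#define XCAT(a, b) CAT(a, b)

/* Lets CAT(ONE, TWO), which pastes the names themselves, compile. */
enum { ONETWO = -1 };

void check_operand_rule(void)
{
    assert(strcmp(STR(VALUE), "VALUE") == 0);  /* argument left as written  */
    assert(strcmp(XSTR(VALUE), "42") == 0);    /* argument expanded first   */
    assert(CAT(ONE, TWO) == -1);               /* ONE ## TWO -> ONETWO      */
    assert(XCAT(ONE, TWO) == 12);              /* 1 ## 2     -> 12          */
}
```

The extra level of indirection in XSTR and XCAT is the usual way to force expansion before stringifying or pasting.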

Once the above steps are performed, there's one more final scan during the rescanning and further replacement step. As a special rule, during this scan for a particular macro you're no longer allowed to expand that same macro. The usual jargon for this is "blue paint"; the macro is marked (or "painted blue") for this expansion. Once the entire scan is complete, the macro is "unpainted".
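A short sketch of the blue paint rule on its own, using hypothetical names (limit, scale, check_blue_paint): a macro that mentions its own name in its replacement list is expanded once, and the inner occurrence is left alone. Because a variable and a function with those names exist, the leftover name still compiles.

```c
#include <assert.h>

int limit = 10;

/* Object-like case: while the replacement of `limit` is rescanned, `limit`
   is painted blue, so the inner name refers to the variable above. */
#define limit (limit + 1)

int scale(int x) { return x * 100; }

/* Function-like case: the inner `scale` is not expanded again either, so it
   ends up as a call to the real function. */
#define scale(x) scale((x) + 1)

void check_blue_paint(void)
{
    assert(limit == 11);      /* (limit + 1), expanded once, not forever */
    assert(scale(1) == 200);  /* scale((1) + 1) calls the function       */
}
```

Without a same-named variable or function to fall back on, the unexpanded name is simply an undeclared identifier, which is the compile error seen in the question.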

The explanation

Let's take your first example, but I'm going to change it just a tiny bit:

#define UtlVbox(first, ...) Vbox(first, ##__VA_ARGS__, NULL)
#define UtlHbox(first, ...) Hbox(first, ##__VA_ARGS__, NULL)
#define foomacro Foo
UtlHbox(foomacro,UtlVbox(UtlHbox(Bar)))

Here I'm just taking the "C" away to focus on the preprocessor only. I've also changed the invocation to call a macro foomacro to highlight something. Now here's how the UtlHbox invocation expands.

We start with argument identification. UtlHbox has formal arguments first and ...; the invocation has arguments foomacro and UtlVbox(UtlHbox(Bar)). So first is foomacro and __VA_ARGS__ is UtlVbox(UtlHbox(Bar)).

Next we perform argument substitution using the replacement list, which is:

Hbox(first, ##__VA_ARGS__, NULL)

...so we replace first with the result of expanding foomacro (that is, Foo), and __VA_ARGS__ with UtlVbox(UtlHbox(Bar)) literally. The latter case is different because, in this replacement list, __VA_ARGS__ is an operand (namely, the right-hand side) of a paste operator; thus, it does not get expanded. So we get this:

Hbox(Foo, ## UtlVbox(UtlHbox(Bar)), NULL)

Next we perform stringification and pasting, getting this:

Hbox(Foo, UtlVbox(UtlHbox(Bar)), NULL)

Next we apply rescanning and further replacement for UtlHbox. So we paint UtlHbox blue, then we rescan that token sequence. You can probably already see the trouble coming, but for completeness I'll keep going.

During rescanning and further replacement, we find UtlVbox, which is the other macro. This yields a second level of evaluation for the macro UtlVbox.

In the second level argument identification, first is UtlHbox(Bar); and __VA_ARGS__ is null.

In the second level of argument substitution, we look at UtlVbox's replacement list, which is:

Vbox(first, ##__VA_ARGS__, NULL)

Since first is not stringified or pasted, we evaluate the invoked argument, UtlHbox(Bar), before substituting it. But since UtlHbox is painted blue, we do not recognize it as a macro. __VA_ARGS__, meanwhile, is null. So we simply get:

Vbox(UtlHbox(Bar), ## null, NULL)

In the second-level pasting step, the left operand of ## is a comma and the right operand is the null __VA_ARGS__; this triggers the GNU comma elision rule, so the paste removes the comma, and we get:

Vbox(UtlHbox(Bar), NULL)

In the second level rescanning and replacement, we paint UtlVbox blue, then rescan this piece again. Since UtlHbox is still painted blue, it's still not recognized as a macro. Since nothing else is a macro, the scan completes.

So backing out a level, we wind up with this:

Hbox(Foo, Vbox(UtlHbox(Bar), NULL), NULL)

...and, since rescanning and replacement are now done for both macros, we unpaint UtlVbox and UtlHbox before proceeding.
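The outcome of this walkthrough can be checked from inside a program by stringifying the fully expanded result. This is only a sketch and it assumes GCC or a compatible compiler, since it depends on the GNU comma elision. SHOW, SHOW_, squeeze and check_walkthrough are hypothetical helpers, and END is a placeholder used instead of NULL so that the NULL macro from the headers does not expand inside the string.

```c
#include <assert.h>
#include <string.h>

#define UtlVbox(first, ...) Vbox(first, ##__VA_ARGS__, END)
#define UtlHbox(first, ...) Hbox(first, ##__VA_ARGS__, END)
#define foomacro Foo

/* SHOW expands its argument fully, then SHOW_ turns the result into a string. */
#define SHOW_(...) #__VA_ARGS__
#define SHOW(...)  SHOW_(__VA_ARGS__)

/* Copies src into dst without blanks, because the spacing of stringified
   macro output is an implementation detail. dst needs strlen(src) + 1 bytes. */
char *squeeze(char *dst, const char *src)
{
    char *out = dst;

    for (; *src != '\0'; src++)
        if (*src != ' ')
            *out++ = *src;
    *out = '\0';
    return dst;
}

void check_walkthrough(void)
{
    char buf[128];

    /* The inner UtlHbox sits in the pasted __VA_ARGS__ of the outer UtlHbox,
       so it is only seen during the rescan, where UtlHbox is painted blue. */
    squeeze(buf, SHOW(UtlHbox(foomacro, UtlVbox(UtlHbox(Bar)))));
    assert(strcmp(buf, "Hbox(Foo,Vbox(UtlHbox(Bar),END),END)") == 0);

    /* The same nesting through the non-pasted `first` parameter expands fully. */
    squeeze(buf, SHOW(UtlHbox(UtlVbox(UtlHbox(Bar)))));
    assert(strcmp(buf, "Hbox(Vbox(Hbox(Bar,END),END),END)") == 0);
}
```

The second check shows that the nesting itself is not the problem: only the position inside the pasted __VA_ARGS__ of the same macro prevents expansion.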

Solution

"Is there any way to solve this?"

Well, note that there are two levels of expansion; one occurs during argument substitution, and the other during rescanning and replacement. The former happens before blue paint applies, and it can recurse indefinitely:

#define BRACIFY(NAME_) { NAME_ }
BRACIFY(BRACIFY(BRACIFY(BRACIFY(BRACIFY(Z)))) BRACIFY(X))

...will happily expand to:

{ { { { { Z } } } } { X } }

This looks like what you want to do. But "argument substitution" evaluation only happens if your arguments aren't being stringified or pasted. So what's really killing you here is the GNU comma elision feature; your use of it applies the paste operator to __VA_ARGS__, which disqualifies your varying arguments from expanding during argument substitution. Instead, they only get to expand during rescanning and replacement, and in that stage, your macro is painted blue.

So the solution is to simply avoid comma elision. In your case, that's actually pretty easy. Let's take a closer look:

#define UtlVbox(first, ...) Vbox(first, ##__VA_ARGS__, NULL)
#define UtlHbox(first, ...) Hbox(first, ##__VA_ARGS__, NULL)

So you want UtlVbox(a) to become Vbox(a, NULL), and UtlVbox(a, b) to become Vbox(a, b, NULL). How about just doing this then?

#define UtlVbox(...) Vbox(__VA_ARGS__, NULL)
#define UtlHbox(...) Hbox(__VA_ARGS__, NULL)

Now this:

UtlHbox(UtlHbox(UtlHbox(1)));
UtlHbox(UtlHbox(UtlHbox(2, 1)));
UtlHbox(UtlHbox(2, UtlHbox(1)));
UtlHbox(2, UtlHbox(UtlHbox(1)));
UtlHbox(3, UtlHbox(2, UtlHbox(1)));
UtlHbox(UtlHbox(UtlVbox(1)));
UtlHbox(UtlHbox(UtlVbox(2, 1)));
UtlHbox(UtlHbox(2, UtlVbox(1)));
UtlHbox(2, UtlHbox(UtlVbox(1)));
UtlHbox(3, UtlHbox(2, UtlVbox(1)));
UtlHbox(UtlVbox(UtlHbox(1)));
UtlHbox(UtlVbox(UtlHbox(2, 1)));
UtlHbox(UtlVbox(2, UtlHbox(1)));
UtlHbox(2, UtlVbox(UtlHbox(1)));
UtlHbox(3, UtlVbox(2, UtlHbox(1)));
UtlVbox(UtlHbox(UtlHbox(1)));
UtlVbox(UtlHbox(UtlHbox(2, 1)));
UtlVbox(UtlHbox(2, UtlHbox(1)));
UtlVbox(2, UtlHbox(UtlHbox(1)));
UtlVbox(3, UtlHbox(2, UtlHbox(1)));

...expands to:

Hbox(Hbox(Hbox(1, NULL), NULL), NULL);
Hbox(Hbox(Hbox(2, 1, NULL), NULL), NULL);
Hbox(Hbox(2, Hbox(1, NULL), NULL), NULL);
Hbox(2, Hbox(Hbox(1, NULL), NULL), NULL);
Hbox(3, Hbox(2, Hbox(1, NULL), NULL), NULL);
Hbox(Hbox(Vbox(1, NULL), NULL), NULL);
Hbox(Hbox(Vbox(2, 1, NULL), NULL), NULL);
Hbox(Hbox(2, Vbox(1, NULL), NULL), NULL);
Hbox(2, Hbox(Vbox(1, NULL), NULL), NULL);
Hbox(3, Hbox(2, Vbox(1, NULL), NULL), NULL);
Hbox(Vbox(Hbox(1, NULL), NULL), NULL);
Hbox(Vbox(Hbox(2, 1, NULL), NULL), NULL);
Hbox(Vbox(2, Hbox(1, NULL), NULL), NULL);
Hbox(2, Vbox(Hbox(1, NULL), NULL), NULL);
Hbox(3, Vbox(2, Hbox(1, NULL), NULL), NULL);
Vbox(Hbox(Hbox(1, NULL), NULL), NULL);
Vbox(Hbox(Hbox(2, 1, NULL), NULL), NULL);
Vbox(Hbox(2, Hbox(1, NULL), NULL), NULL);
Vbox(2, Hbox(Hbox(1, NULL), NULL), NULL);
Vbox(3, Hbox(2, Hbox(1, NULL), NULL), NULL);
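
As a runtime sanity check of the macros without comma elision, here is a small sketch with stand-ins for the library functions. The bodies of Vbox and Hbox below, as well as make_box, children_seen, dummy_box and check_nested, are assumptions made up for this example; the real library does something else entirely. The stand-ins just count every child they receive before the terminating NULL.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

int children_seen = 0;   /* total children received by all boxes */
int dummy_box;           /* every stand-in box returns its address */

/* Shared helper: counts children up to the first NULL. */
static void *make_box(void *first, va_list ap)
{
    for (void *p = first; p != NULL; p = va_arg(ap, void *))
        children_seen++;
    return &dummy_box;
}

/* Hypothetical stand-ins for the library's layout functions. */
void *Vbox(void *first, ...)
{
    va_list ap;
    va_start(ap, first);
    void *box = make_box(first, ap);
    va_end(ap);
    return box;
}

void *Hbox(void *first, ...)
{
    va_list ap;
    va_start(ap, first);
    void *box = make_box(first, ap);
    va_end(ap);
    return box;
}

/* No ## here, so nested invocations expand during argument substitution.
   The cast keeps the terminator pointer-typed even if NULL is a plain 0. */
#define UtlVbox(...) Vbox(__VA_ARGS__, (void *)NULL)
#define UtlHbox(...) Hbox(__VA_ARGS__, (void *)NULL)

void check_nested(void)
{
    children_seen = 0;
    void *root = UtlHbox("a", UtlVbox("b", UtlHbox("c")));
    assert(root != NULL);
    assert(children_seen == 5);   /* 1 (inner H) + 2 (V) + 2 (outer H) */
}
```

Note that UtlVbox() with no arguments still expands to Vbox(, NULL) and fails to compile, so an empty box has to be written as UtlVbox(NULL).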
H Walters
  • Solid answer! As mentioned in the OP, the reason I define the macro with `first` is because I also need to deal with the case where the box should be declared empty (i.e. `Vbox(NULL)`). Without `first`, the macro can be called `UtlVbox()`, but that will expand to `Vbox(, NULL)`, since I can't elide the comma *after* `__VA_ARGS__`. Using `first` informs the user that at least one parameter should be given to the macro (even if it's `NULL`). – Wasabi Jun 17 '17 at 13:16
  • That being said, I've just realized that removing `first` and using these macros is still a better idea than using the functions themselves. Without `first`, if the user calls `UtlVbox()`, there will be a compilation error that'll be hard to understand (complaining about a comma that the user can't see in their own code). However, the functions themselves are really dangerous in that forgetting to add the NULL compiles just fine, but causes run-time errors. I think a hard-to-parse compilation error is better than a runtime error! – Wasabi Jun 17 '17 at 13:20