21

Many times I want a function to receive a variable number of arguments, terminated by NULL, for instance

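#include <stdarg.h>
#include <stddef.h>

#define push(stack, ...) _push(stack, __VA_ARGS__, (char *)NULL)
void _push(stack_t stack, char *s, ...) {
    va_list args;
    va_start(args, s);
    /* Start with the named argument s, then walk the variadic ones until NULL. */
    for (; s != NULL; s = va_arg(args, char *)) push_single(stack, s);
    va_end(args);
}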

Can I instruct gcc or clang to warn if push receives arguments that are not char*? Something similar to __attribute__((format)), but for multiple arguments of the same pointer type.

mikebloch

3 Answers

16

I know you're thinking of using __attribute__((sentinel)) somehow, but this is a red herring.
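For context, here is a minimal sketch of what the sentinel attribute does and does not check. The function name count_strings is hypothetical and exists only for illustration. GCC and clang warn when a call to a sentinel function does not end in an explicit NULL, but the variadic arguments themselves are still unchecked, so a call such as count_strings("a", 42, (char *)NULL) gets no type diagnostic.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts the strings in a NULL-terminated argument list.
   The sentinel attribute only asks the compiler to check that each call ends
   with an explicit NULL; it says nothing about the types in between. */
__attribute__((sentinel))
static int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;

    va_start(ap, first);
    /* Begin with the named argument, then read variadic ones until NULL. */
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}
```

Calling count_strings("a", "b") without the trailing NULL is what the attribute is designed to catch.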

What you want is to do something like this:

#define push(s, args...) ({                   \
  char *_args[] = {args};                     \
  _push(s,_args,sizeof(_args)/sizeof(char*)); \
})

which wraps:

void _push(stack_t s, char *args[], int argn);

which you can write as an ordinary function over an array of strings, exactly the way you would hope to write it!

Then you can call:

push(stack, "foo", "bar", "baz");
push(stack, "quux");
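The macro above relies on GNU extensions (named variadic parameters and statement expressions). A rough equivalent in standard C99 can be sketched with a compound literal; the names demo_stack, push_n and PUSH are assumptions made up for this example, not part of the original code. Because every argument becomes an initializer for a `const char *` array element, the compiler diagnoses an int or an unrelated pointer type the same way it would in any array initializer (a literal 0 and a void* still pass silently).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the question's stack type. */
typedef struct {
    const char *items[16];
    size_t count;
} demo_stack;

/* Typed worker: receives the strings as an ordinary array plus a length. */
static void push_n(demo_stack *s, const char *const *args, size_t argn)
{
    assert(s != NULL);
    /* Silently stops when the fixed-capacity demo stack is full. */
    for (size_t i = 0; i < argn && s->count < 16; i++)
        s->items[s->count++] = args[i];
}

/* The argument list appears twice, but the copy inside sizeof is never
   evaluated, so side effects in the arguments happen only once. */
#define PUSH(s, ...)                                                       \
    push_n((s), (const char *const[]){__VA_ARGS__},                        \
           sizeof((const char *const[]){__VA_ARGS__}) / sizeof(const char *))
```

With these definitions, PUSH(&st, "foo", "bar", "baz") passes three strings and a count of 3, and no NULL terminator is needed.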
geocar
  • @Chi-Lan - Depending on the implementation of `_push()` GCC will remove unnecessary loads. – geocar May 20 '12 at 18:41
  • I am trying to do something similar to this. I really want an __attribute that guarantees type safety of the varargs. Basically I want to make sure they're all char*'s. Anyone know how to do that? – Mark Pauley Jan 22 '14 at 22:41
2

I can only think of something like this:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct tArg
{
  const char* Str;
  struct tArg* Next;
} tArg;

tArg* Arg(const char* str, tArg* nextArg)
{
  tArg* p = malloc(sizeof(tArg));
  if (p != NULL)
  {
    p->Str = str;
    p->Next = nextArg;
  }
  else
  {
    while (nextArg != NULL)
    {
      p = nextArg->Next;
      free(nextArg);
      nextArg = p;
    }
  }
  return p;
}

void PrintR(tArg* arg)
{
  while (arg != NULL)
  {
    tArg* p;
    printf("%s", arg->Str);
    p = arg->Next;
    free(arg);
    arg = p;
  }
}

void (*(*(*(*(*(*(*Print8
  (const char* Str))
  (const char*))
  (const char*))
  (const char*))
  (const char*))
  (const char*))
  (const char*))
  (const char*)
{
  printf("%s", Str);
  // There's probably a UB here:
  return (void(*(*(*(*(*(*(*)
    (const char*))
    (const char*))
    (const char*))
    (const char*))
    (const char*))
    (const char*))
    (const char*))&Print8;
}

int main(void)
{
  PrintR(Arg("HELLO", Arg(" ", Arg("WORLD", Arg("!", Arg("\n", NULL))))));
//  PrintR(Arg(1, NULL));        // warning/error
//  PrintR(Arg(&main, NULL));    // warning/error
//  PrintR(Arg(0, NULL));        // no warning/error
//  PrintR(Arg((void*)1, NULL)); // no warning/error

  Print8("hello")(" ")("world")("!")("\n");
// Same warning/error compilation behavior as with PrintR()
  return 0;
}
Alexey Frunze
-1

The problem with C variadics is that they are really bolted on afterwards, not really designed into the language. The main problem is that the variadic parameters are anonymous, they have no handles, no identifiers. This leads to the unwieldy VA macros to generate references to parameters without names. It also leads to the need to tell those macros where the variadic list starts and what type the parameters are expected to be of.

All this information really ought to be encoded in proper syntax in the language itself.

For example, one could extend existing C syntax with formal parameters after the ellipsis, like so

void foo ( ... int counter, float arglist );

By convention, the first parameter could be for the argument count and the second for the argument list. Within the function body, the list could be treated syntactically as an array.

With such a convention, the variadic parameters would no longer be anonymous. Within the function body, the counter can be referenced like any other parameter and the list elements can be referenced as if they were array elements of an array parameter, like so

void foo ( ... int counter, float arglist ) {
  int i;
  for (i=0; i<counter; i++) {
    printf("list[%i] = %f\n", i, arglist[i]);
  }
}

With such a feature built into the language itself, every reference to arglist[i] would then be translated to the respective addresses on the stack frame. There would be no need to do this via macros.

Furthermore, the argument count would automatically be inserted by the compiler, further reducing opportunity for error.

A call to

foo(1.23, 4.56, 7.89);

would be compiled as if it had been written

foo(3, 1.23, 4.56, 7.89);

Within the function body, any access to an element beyond the number of arguments actually passed could be checked at runtime and raise a runtime fault, thereby greatly enhancing safety.

Last but not least, all the variadic parameters are typed and can be type checked at compile time just like non-variadic parameters are checked.
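The compiler-inserted count described above can be approximated in today's C with a macro, as a sketch only. The names foo_n and FOO are hypothetical, and the function returns the sum of its arguments purely so that there is something to observe. The type checking is weaker than in the proposal: a float array initializer silently converts integer and double arguments, although it does reject pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker: an explicit count followed by the typed list. */
static float foo_n(size_t counter, const float *arglist)
{
    assert(counter == 0 || arglist != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < counter; i++)
        sum += arglist[i];
    return sum;
}

/* The macro plays the role of the compiler in the proposal: it derives the
   count from the argument list and passes it ahead of the list itself. */
#define FOO(...)                                                   \
    foo_n(sizeof((const float[]){__VA_ARGS__}) / sizeof(float),    \
          (const float[]){__VA_ARGS__})
```

A call such as FOO(1.23, 4.56, 7.89) then expands to a call of foo_n with a count of 3 and a three-element array.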

In some use cases it would of course be desirable to have alternating types, such as when writing a function to store keys and values in a collection. This could also be accommodated simply by allowing more formal parameters after the ellipsis, like so

void store ( collection dict, ... int counter, key_t key, val_t value );

This function could then be called as

store(dict, key1, val1, key2, val2, key3, val3);

but would be compiled as if it had been written

store(dict, 3, key1, val1, key2, val2, key3, val3);

The types of actual parameters would be compile time checked against the corresponding variadic formal parameters.

Within the body of the function the counter would again be referenced by its identifier, keys and values would be referenced as if they were arrays,

key[i] refers to the key of the i-th key/value pair
value[i] refers to the value of the i-th key/value pair

and these references would be compiled to their respective addresses on the stack frame.
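The alternating key/value case can also be approximated in existing C by passing an array of pairs, again only as a sketch. The types demo_key, demo_val, demo_dict and the names store_n and STORE are assumptions invented for this example. Unlike the proposed syntax, each pair has to be written in braces at the call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key and value types standing in for key_t and val_t. */
typedef const char *demo_key;
typedef int demo_val;

struct kv_pair { demo_key key; demo_val value; };

/* Hypothetical collection: a fixed-capacity array of pairs. */
typedef struct {
    struct kv_pair slots[8];
    size_t used;
} demo_dict;

/* Stores up to counter pairs and returns how many were actually stored. */
static size_t store_n(demo_dict *dict, size_t counter, const struct kv_pair *pairs)
{
    assert(dict != NULL);
    size_t stored = 0;
    for (size_t i = 0; i < counter && dict->used < 8; i++, stored++)
        dict->slots[dict->used++] = pairs[i];
    return stored;
}

/* The count is derived from the brace-enclosed pairs and passed first. */
#define STORE(dict, ...)                                              \
    store_n((dict),                                                   \
            sizeof((const struct kv_pair[]){__VA_ARGS__}) /           \
                sizeof(struct kv_pair),                               \
            (const struct kv_pair[]){__VA_ARGS__})
```

A call then looks like STORE(&d, {"one", 1}, {"two", 2}, {"three", 3}), and each key and value is checked against the member types of struct kv_pair.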

None of this is really difficult to do, nor has it ever been. However, C's design philosophy simply isn't conducive to such features.

Without an enterprising C compiler implementor (or C preprocessor implementor) taking the lead to implement this or a similar scheme, it is unlikely we will ever see anything of this kind in C.

The trouble is that folks who are interested in type safety and willing to put in the work to build their own compilers usually come to the conclusion that the C language is beyond salvage and one may as well start over with a better designed language to begin with.

I have been there myself: I eventually decided to abandon the attempt, then implemented one of Wirth's languages and added type safe variadics to that instead. I have since run into other people who told me about their own aborted attempts. Proper type safe variadics in C seem poised to remain elusive.

trijezdci
  • The reason why doesn't have to do with the syntax, but the changes to the ABI: The x86_64 ABI and i386 fastcall ABI's for example don't require that arguments have an "address", and while stack engines on modern CPUs really do a lot for the cost of spilling arguments, the double-waste of the precious L1 makes it a net-loss. If you just wanted to play with the syntax and the expressiveness, you could try C4 since it has its own (virtual) ABI anyway: https://github.com/rswier/c4 – geocar Feb 02 '16 at 15:33
  • If this syntax was designed into the language, then implementors would have to map it to their respective ABIs. There is no reason why you couldn't do that. The references are resolved at compile time (at least in terms of relocatable references) and the compiler has knowledge of the target ABI. Furthermore, there is no reason why a variadic argument list couldn't be mapped to a register set. The implementation effort is pretty much the same as with implementing the VA macros. The real issue is that the variadic parameters are anonymous. That stands in the way of making it type safe. – trijezdci Feb 02 '16 at 15:38
  • C was designed around an ABI, not the other way around. If you want a Pascal that looks like C, you could build one out of c4 like I suggest -- it's architecturally similar to many Pascal implementations. – geocar Feb 02 '16 at 15:43
  • The ABI is totally irrelevant. The issue is the anonymity of the variadic formal parameters. – trijezdci Feb 02 '16 at 15:45
  • Nonsense. The ABI is absolutely relevant: If the number of arguments (for example) were stored in `%r31` then whether the subsequent values are in registers or not is irrelevant: `args[i]` is simply a much more complex operation like `get_arg(i,frame)` – geocar Feb 02 '16 at 15:49
  • VA macros are the case in point. If it wasn't possible to implement references to variadic arguments then VA macros couldn't be implemented. Whether the arguments are anonymous or named does not fundamentally change this. In fact you could implement a preprocessor that translates named and typed variadic argument syntax into existing C syntax using VA macros. If it can be mapped indirectly, then it can also be mapped directly. – trijezdci Feb 02 '16 at 15:52
  • And last but not least, if you add new functionality, since this wouldn't replace existing untyped variadics anyway, then you can also extend the parameter passing conventions. A compiler switch could be used to generate object code either for use with compilers that also adhere to the extended passing convention, or alternatively, generate object code for use with any compiler by translating to VA macros first. – trijezdci Feb 02 '16 at 15:57
  • Nonsense. Z-series has an ABI that makes it possible to get the number of arguments, and the type of each argument, without knowing their names, simply because the ABI uses an array of arguments. The language is irrelevant. – geocar Feb 02 '16 at 15:57
  • That "compiler switch" you're talking about changes the ABI. – geocar Feb 02 '16 at 15:58
  • That's exactly my point. It does not matter for the ABI whether the args are named or not. But it matters for enforcing type safety, because the compiler cannot enforce types that are not mentioned in the function prototype. – trijezdci Feb 02 '16 at 15:58
  • In any event, you can implement the scheme I described above WITHOUT breaking the existing ABI. I know because I did that several years ago. The reason I abandoned the effort wasn't because I couldn't do it but because there are too many other things in C that need fixing which type safe languages already take care of, thus overall it was more convenient to take one of those and add variadics than taking C and make it type safe. – trijezdci Feb 02 '16 at 16:02