11

In this Q&A it is established that you should always call va_end():

What exactly is va_end for? Is it always necessary to call it?

But what if a piece of code longjmp's before you reach the va_end? Is there any promise on va_end's part that it will be okay? Or conceptually might it (for example) do a memory allocation in va_start() that would be leaked, vs. just using stack tricks?

Community
  • I suppose you would need to call `va_end` before calling `longjmp`, same way you should `free()` any mallocations being jumped over – M.M Aug 27 '15 at 21:23
  • Isn't that the point of the question? – Weather Vane Aug 27 '15 at 21:23
  • @WeatherVane Yes. The question is about what if what you are doing to process each argument item can longjmp. If so, then you have to take the variable args and put them in another place before doing your calls...but what place? Likely a bounded stack array with a size limit. Would be nice if what va_args had could already serve as that place...and in real implementations, it generally is. But the standard is unlikely to promise that. – HostileFork says dont trust SE Aug 27 '15 at 21:26
  • Can't you just organise your code (you don't even show it, I suppose you feel you are exempt) , to detect a forthcoming `longjmp` and clean up the varg system before doing it? I stand by my earlier (deleted in deference to high rep seniors) comment - it's a poorly judged question. – Weather Vane Aug 27 '15 at 21:35
  • @WeatherVane I'm able to do many things, but analyze arbitrary C functions and determine whether they longjmp or not [is unfortunately not one of those things](https://en.wikipedia.org/wiki/Halting_problem). – HostileFork says dont trust SE Aug 27 '15 at 21:39
  • @WeatherVane: In embedded systems, the most practical design for pseudo-multi-tasking is often to have a "main_poll()" method which is called while waiting for I/O, and which may sometimes need to perform a "partial reboot". The `longjmp` mechanism can work well for that, provided that a suitable means exists to handle any necessary cleanup first. – supercat Aug 27 '15 at 22:12
  • @supercat I've used a brutal method in embedded systems too as a last resort when all else fails but the show must go on, it makes the partial reboot you mention, nothing left dangling. – Weather Vane Aug 27 '15 at 22:15
  • @supercat: Huh? Sounds more like a hack due to poor system design in by far most cases. If you are talking about co-routines/coop-multitasking, there are definitely better ways to implement it. But I agree that this has been practice up to the '90s, as embedded devs were mostly engineers with little CS background, so they "just made it work". Good luck maintaining such code. – too honest for this site Aug 27 '15 at 22:16
  • @Olaf: Not co-routines--each call to "main_poll()" would use a state machine to do some stuff the immediate caller wouldn't generally care about and return. Unless one wants to use state machines for everything and require that code back all the way out to top-level before running the polling tasks, I know of no way to allow code to process a "partial-reboot" request received during its "main_poll" other than using "longjmp" or having every method call that might indirectly invoke "main_poll" check an early-exit condition. – supercat Aug 27 '15 at 22:20
  • @supercat: that is quite similar to co-routines or coop-tasks which are also normally state-machine based (it is actually not that complicated if done properly). Well, there are better approaches if you use a modern event-based system architecture. But I very well know that many old-school devs fear such approaches like . – too honest for this site Aug 27 '15 at 22:37
  • @supercat in embedded systems with or without multitasking I found the best approach was an intermediate "heartbeat" layer driven by a timer, that would service information from interrupts and present it to the "main loop" as it were, though that is a simplified statement. The best way of handling errors, pass them from interrupts as status values, or from subroutines as return values. I see long jumps as the software equivalent of a short circuit. If all else fails, the equivalent of the reset button. If you *designed* long jumps into a stack based system because you could, shame on you. – Weather Vane Aug 27 '15 at 22:45
  • @WeatherVane: The systems where I've used longjmp would allocate things from a pool at startup according to the current configuration; in response to a "set configuration" command, they would abandon anything the main code might have been doing, reset the pool, and allocate new buffers from that, but processes that use static allocations could continue unaffected. It might have been possible to use a global `exit_ASAP` and scatter the code with a lot of `if (exit_ASAP) return;` statements, but it would have been harder to guarantee a timely response to the "set configuration" command. – supercat Aug 28 '15 at 16:20
  • @Olaf: I think of the term "coroutines" as implying two effective threads of execution, which are switched by calling a `spin()`, `poll()`, `task_switch()`, or other such function; systems using state machines and a fixed polling loop I describe as using state machines and a fixed polling loop. I like coroutines as an approach, though platforms vary as to how effectively and safely they can be implemented. It's possible to box oneself into a corner while using coroutines if a routine ends up running slower than expected, but its caller isn't expecting that it could task switch, but... – supercat Aug 28 '15 at 16:23
  • ...that can often be handled using some simple "busy" flags, and having things like incremental garbage-collection simply hold off while other tasks are busy. A couple nice things about coroutines are that (1) all operations between direct or indirect calls to the task-switch function are automatically atomic, and (2) on some platforms, if one thread is in a `while((uint16_t)(time-start_time) < duration) task_switch()` loop while the other thread calls `task_switch`, the time required to task switch to the while loop, check its condition, and task-switch back may be shorter than... – supercat Aug 28 '15 at 16:28
  • ...the time required for a more sophisticated OS to decide the thread that had control should keep it. – supercat Aug 28 '15 at 16:29

2 Answers

8

The C99 rationale explicitly states that va_start may allocate memory that ends up freed by va_end, exactly what you guessed in your question:

7.15.1.2 The va_copy macro

[...]

30 A much simpler approach is to copy the va_list object used to represent processing of the arguments. However, there is no safe way to do this in C89 because the object may include pointers to memory allocated by the va_start macro and destroyed by the va_end macro.
The new va_copy macro provides this safe mechanism.

[...]

So yes, you need to invoke va_end before a longjmp. At the very least you'd otherwise have a memory leak on such an implementation.
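
As a minimal sketch of that rule, the variadic function below calls `va_end` on the early-exit path before it jumps. All names here (`error_env`, `sum_nonnegative`, `checked_sum3`) are hypothetical and exist only for illustration; the sketch covers just the case where the `longjmp` is issued by the variadic function itself.

```c
#include <setjmp.h>
#include <stdarg.h>

/* Hypothetical jump target; the caller arms it with setjmp(). */
static jmp_buf error_env;

/* Sums `count` int arguments; jumps to error_env if one is negative. */
static int sum_nonnegative(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        int value = va_arg(ap, int);
        if (value < 0) {
            va_end(ap);             /* release ap before leaving this frame */
            longjmp(error_env, 1);
        }
        total += value;
    }
    va_end(ap);                     /* normal path pairs with va_start too */
    return total;
}

/* Hypothetical caller: returns the sum, or -1 if sum_nonnegative jumped out. */
static int checked_sum3(int a, int b, int c)
{
    if (setjmp(error_env) != 0)
        return -1;
    return sum_nonnegative(3, a, b, c);
}
```

Here `checked_sum3(1, 2, 3)` yields 6 and `checked_sum3(1, -2, 3)` yields -1. This does not help when the jump comes from code you call and cannot inspect.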


Supposedly Pyramid OSx had an implementation where memory allocations were performed by va_start. Function arguments were passed in registers. This was the case even for variadic functions. It may have pre-dated ANSI C's invention of function prototypes, meaning the caller wouldn't know whether it was dealing with a variadic function. va_start allocated memory, presumably to store the function parameter values in a way that va_arg could easily access it. va_end freed the allocated memory.

Its implementation of va_start and va_end actually required the two to be matched syntactically, because the macros used unbalanced braces. ANSI C already disallowed that implementation, but the same principle could be made to work with matching braces.

I can find very little concrete information on this implementation; it's just bits and pieces on Usenet from the late '80s and early '90s. What little I did find may be incomplete or even just plain wrong. More details are very welcome, especially from anyone who used this implementation themselves.

  • I think it's also worth quoting the `va_end` rationale which contains `In many implementations, this is a do-nothing operation; but those implementations that need it probably need it badly.`. `badly` is very vague though. – cremno Aug 27 '15 at 21:29
  • *"those implementations that need it probably need it badly"* ... heh, pretty funny. :-) – HostileFork says dont trust SE Aug 27 '15 at 21:32
  • @cremno I'm trying to cover that by finding a concrete example of an implementation where `va_end` is not a no-op, but I'm doubtful I'll find any. –  Aug 27 '15 at 21:32
  • Just to add the current standard: [7.16.1.3p2](http://port70.net/~nsz/c/c11/n1570.html#7.16.1.3p2) and [7.16.1.2](http://port70.net/~nsz/c/c11/n1570.html#7.16.1.2). However, it only refers to "return", not to longjmp, and no longer mentions allocated memory. It's not quite clear to me: `stdarg.h` does **not** require `stdlib.h` or any other memory allocation functions. So does it only refer to stack-allocated/auto memory? If so, wouldn't the longjmp treat this properly? – too honest for this site Aug 27 '15 at 22:25
  • @Olaf It's easily possible for `stdarg.h` to declare `void __va_end(va_list);` and then have `#define va_end(ap) __va_end(ap)`. Since the body of `__va_end` doesn't need to be visible, there doesn't need to be a visible declaration of `free` etc. in order to free memory. –  Aug 27 '15 at 22:40
  • @Olaf: a library implementation doesn't need to use `stdlib.h` or the functions in `stdlib.h` to perform allocation. – Michael Burr Aug 27 '15 at 22:43
  • @hvd: I'm curious from an embedded system's view, where you most times don't have the standard memory management functions and often not even the standard libraries ([freestanding implementation](http://port70.net/~nsz/c/c11/n1570.html#4p6)). While the environment is not required to provide any libraries, it has to provide some headers, including `stdarg.h`. Wouldn't this requirement imply that `stdarg.h` must **not** require an externally defined function? And as a hosted implementation is basically a _freestanding_ with extra requirements, wouldn't that also apply to such? – too honest for this site Aug 27 '15 at 22:48
  • @Olaf No, it doesn't imply that. A freestanding implementation isn't required to provide memory allocation functions, but it may do so anyway, and if it does do so there is nothing prohibiting it from using those functions internally. –  Aug 27 '15 at 22:51
  • @MichaelBurr: No, but it has to provide some kind of memory allocation. And for a freestanding environment the standard does not enforce this or any other linked library. But it does require `stdarg.h` to be provided by the implementation. – too honest for this site Aug 27 '15 at 22:52
  • @hvd: We are hopefully still talking about some non-auto memory allocation? Ok, so anything providing such allocation would have to be in the `stdarg.h` header, thus being local to the compilation unit. Hmm, I'll have to think about that. – too honest for this site Aug 27 '15 at 22:55
2

If you're using a jmp_buf stored in a global variable (the usual pattern), it should probably be safe to make a copy of it and call setjmp on the global buffer, so that a longjmp lands in your code rather than in the outer caller. If a longjmp occurs, your code can then call va_end and longjmp using the stored copy of the buffer. If your code exits normally, it needs to restore the global buffer before returning.
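
A rough sketch of this interception pattern follows. The names `g_env`, `item_fn`, `for_each_arg`, `reject_negative` and `run_items` are made up for illustration. It assumes a `jmp_buf` can be copied with `memcpy`, which ISO C does not promise even though common implementations tolerate it. The sketch re-arms `setjmp` on every iteration, after `va_arg`, so that no local variable changes between the `setjmp` and a later `longjmp`; this is one cautious way to keep `ap` well defined in the handler, not the only possible arrangement.

```c
#include <setjmp.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical global jump buffer; the outer caller arms it with setjmp(). */
static jmp_buf g_env;

/* Hypothetical per-item callback that may call longjmp(g_env, ...). */
typedef void (*item_fn)(int item);

/* Calls fn on each of `count` int arguments, intercepting any longjmp. */
static void for_each_arg(item_fn fn, int count, ...)
{
    jmp_buf outer;
    va_list ap;

    /* Assumption: a jmp_buf survives a byte copy. */
    memcpy(outer, g_env, sizeof outer);

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        int item = va_arg(ap, int);

        /* Armed after va_arg, so ap, i and item are unmodified between
           this setjmp and any longjmp issued by fn. */
        if (setjmp(g_env) != 0) {
            va_end(ap);
            memcpy(g_env, outer, sizeof outer);
            longjmp(g_env, 1);   /* the original jump value is not kept */
        }
        fn(item);
    }
    va_end(ap);
    memcpy(g_env, outer, sizeof outer);  /* restore on the normal path */
}

/* Example callback: jumps out on a negative item. */
static void reject_negative(int item)
{
    if (item < 0)
        longjmp(g_env, 2);
}

/* Example outer caller: 0 if every item was processed, 1 if one jumped. */
static int run_items(int a, int b)
{
    if (setjmp(g_env) != 0)
        return 1;
    for_each_arg(reject_negative, 2, a, b);
    return 0;
}
```

Calling `setjmp` once per item has a cost, so this trades speed for staying within what the standard says about locals modified between `setjmp` and `longjmp`.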

supercat
  • @HostileFork in practical terms, if your code will not be run on a platform with non-trivial `va_end` then I'd probably just ignore the issue. However presumably you must have something in place to deal with heap allocations occurring in the jumped-over code. – M.M Aug 27 '15 at 23:53