4

In the case that a local jmp_buf is actually represented by registers rather than stack memory, is it possible for setjmp or longjmp to cause the contents of the local jmp_buf to be indeterminate when setjmp returns from a longjmp?


The suggested duplicate, "Is it allowed to do longjmp() multiple times for one setjmp() call?", asks the question in the context of a global variable. It was suggested because its answer explains that longjmp does not modify the jmp_buf in a way that would prevent it from being used again, and that this sufficiently answers the question for a local variable too.
However, a local variable is treated differently from a global variable. In particular, if the local jmp_buf variable is actually held in registers and not in memory, restoration after longjmp may not yield a reusable jmp_buf variable.


As an academic exercise, I was attempting to use setjmp as a substitute for goto. To keep the loop replacement local to the function, the jmp_buf used is also a local variable.

#include <setjmp.h>

void do_stuff(int i);  /* defined elsewhere */

void foo (int n) {
    jmp_buf jb;
    volatile int i;
    i = setjmp(jb);
    if (i < n) {
        do_stuff(i);
        longjmp(jb, ++i);
    }
}

I understand that non-volatile local variables that have been modified between the setjmp call and the longjmp call have indeterminate values after longjmp. However, I was curious about the local jmp_buf variable itself, particularly in the case where the jmp_buf variable is represented by registers rather than memory on the stack.

It is unclear whether longjmp itself counts as something that may modify the local jmp_buf variable, and whether this means its contents are indeterminate when setjmp returns after the call to longjmp.

I thought I could easily dispatch the issue by declaring jb to be volatile, but this triggered a warning (which I treat as an error):

... error: passing argument 1 of ‘_setjmp’ discards ‘volatile’ qualifier from pointer target type [-Werror=discarded-qualifiers]
     setjmp(jb);
            ^~

Also, the specification of setjmp does not speak to whether it is saving the register values as they would be after setting the jmp_buf or before setting the jmp_buf.

If I need to be concerned about it, I can create a volatile copy of the jmp_buf and copy its contents around. But, I'd like to avoid that if it isn't required.

jxh
  • Does this answer your question? [Is it allowed to do longjmp() multiple times for one setjmp() call?](https://stackoverflow.com/questions/64175221/is-it-allowed-to-do-longjmp-multiple-times-for-one-setjmp-call) tl;dr: you can reuse a jmp_buf , longjmp won't corrupt it. – jthill May 09 '22 at 17:25
  • @jthill Close. The question is about non-local jumps, and so uses a global `jmp_buf`. See the comments below the accepted answer. – jxh May 09 '22 at 17:27
  • It doesn't matter, the longjmp can't end the jmp_buf's lifetime and that's the only effect storage duration can have. – jthill May 09 '22 at 17:29
  • @jthill But the first comment literally says it doesn't apply to local variables. So the answer doesn't address the question, right? – jxh May 09 '22 at 17:30
  • You're going to have to read the standard and decide for yourself. If you read further into those comments you'll see people pointing out reasons to disbelieve the one you've latched on to. The idea that an object's possible future "death" might have some effect on its present value seems ... well, to be blunt, it strikes me as nonsense. setjmp is required to render the jmp_buf usable. – jthill May 09 '22 at 17:36
  • you can easily write your own [setjmp()/longjmp()](https://github.com/user1095108/cr2) with exactly the behavior you want. – user1095108 May 09 '22 at 17:36
  • @jthill Your current position is significantly different from "this should be closed as a duplicate". – jxh May 09 '22 at 17:38
  • @user1095108: How much stack is a signal handler allowed to use? – Joshua May 09 '22 at 17:43
  • I'm not seeing how the presence of a comment I think you should ignore because it's wrong changes anything else about that Q&A. – jthill May 09 '22 at 23:50
  • @jthill You said "You're going to have to read the standard and decide for yourself.", which I had already done before posting the question. I believe the question is unresolved. – jxh May 09 '22 at 23:55

3 Answers

2

The C11 standard, §7.13.2.1 ¶3, states:

All accessible objects have values, and all other components of the abstract machine have state, as of the time the longjmp function was called, except that the values of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call are indeterminate.

Your jmp_buf object is not changed between setjmp(jb) and longjmp(jb, ++i). The only variable which is changed between the calls is i, which is declared volatile, as the standard suggests.

So, to answer your question, longjmp cannot by itself "modify the contents of the local jmp_buf [in such a way] that would cause its contents to be undefined when setjmp returns", but modifying the jmp_buf between the two calls through other means could definitely cause trouble.
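As a rough illustration of this reading, here is a minimal sketch of the question's loop. It assumes the interpretation above holds (the local `jmp_buf` stays usable because our own code never modifies it). The `do_stuff` stub and the `body_runs` counter are hypothetical stand-ins, and `setjmp` is moved into an expression statement, which is one of the contexts the standard explicitly permits.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for the question's do_stuff(): it only counts calls. */
static int body_runs = 0;
static void do_stuff(int i) { (void)i; ++body_runs; }

/* Calls do_stuff(0) .. do_stuff(n-1) and returns the final counter value. */
int foo(int n) {
    jmp_buf jb;          /* local, and never written by our own code */
    volatile int i = 0;  /* changed between setjmp and longjmp, so volatile */

    assert(n >= 0);

    (void)setjmp(jb);    /* a full expression statement: a permitted context */

    if (i < n) {
        do_stuff(i);
        ++i;             /* the only object modified between the two calls */
        longjmp(jb, 1);  /* reuses jb, relying on the reading given above */
    }
    return i;
}
```

Dropping `volatile` from `i` would put it squarely inside the exception quoted above, since it is a non-volatile automatic object changed between the `setjmp` invocation and the `longjmp` call.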

Marco Bonelli
  • There are multiple ways to interpret "between", as it could be inclusive or exclusive. – jxh May 09 '22 at 20:41
  • 2
    @jxh IMHO there is exactly one sensible way to interpret that "between": the `setjmp()` call happens - a modification happens - the `longjmp()` call happens. Such a modification would be "between" the two calls. What happens *inside* the calls is none of the programmer's business. If library documentation tells me to *"not do X between the invocation of F and G"*, the only control I have is on *my code*; if F and G can break themselves even if I follow the rule, then the library would be broken by design. – Marco Bonelli May 09 '22 at 22:52
  • @jxh (one could also probably ask why on earth would anyone want to bother with `setjmp`/`longjmp` when you could just `goto local_label` in such scenario :P) – Marco Bonelli May 10 '22 at 00:25
  • I can see a use case where the `longjmp` is actually embedded in some kind of error handling macro, and the error handler could be invoked locally to the function that issued `setjmp` or in some called function. The circumstances where the `jmp_buf` itself is local would be more farfetched, but could happen if it was embedded within a context data structure that is local variable to a thread start function. – jxh May 10 '22 at 01:51
  • This answer cannot be accepted as is, because it is missing an explanation of the semantics of `setjmp` with respect to what register state it has preserved. If it has preserved the state of registers prior to setting values in the `jmp_buf`, then it is possible that the `longjmp` ends up restoring the indeterminate values held by the buffer prior to the `setjmp` call. – jxh May 12 '22 at 01:57
  • @jxh existing standards define no such semantics, so I can't really materialize an explanation out of thin air. Both the C standard and the POSIX standard say the same thing, which is what I quote above. I added my own interpretation of what you deem "open to interpretation", but that's about it. As it currently stands, it seems that this is all you get unfortunately. – Marco Bonelli May 12 '22 at 02:22
  • That is also my understanding, that the standard does not define what state actually gets preserved. Therefore, it seems possible that the state restored presents the uninitialized `jmp_buf`. – jxh May 12 '22 at 02:28
  • For your answer to hold, you should cite the *Returns* section for `longjmp`, which says the behavior should be *as if the corresponding invocation of the `setjmp` macro had just returned the value specified by `val`.* You might infer various things from this, but if it is as if `setjmp` is returning, then it should behave as if `setjmp` had finished initializing `jmp_buf`. – jxh May 12 '22 at 02:37
  • @jxh I don't see how quoting that helps to be honest. – Marco Bonelli May 12 '22 at 02:55
  • I am offering you a suggestion that would make it feasible for me to accept your answer. There needs to be a justification that the state after `longjmp` presents an intact `jmp_buf` rather than one that was uninitialized. – jxh May 12 '22 at 03:54
  • @jxh I can see that, but don't see how *"After longjmp is completed, thread execution continues as if the corresponding invocation of the setjmp macro had just returned the value specified by val"* clarifies anything. You are concerned about whether or not the "between" should be inclusive or exclusive of the two function calls, and `longjmp()` is clearly not the problem here, as it merely reads `jmp_buf`; the problem is the initial `setjmp()`. If using a local non-volatile `jmp_buf` was illegal, then you'd already be breaking the rule at the first `setjmp()` invocation. – Marco Bonelli May 12 '22 at 15:05
  • The inference of the *as if invocation of ... `setjmp` ... had just returned* is that upon return, the specified behavior, that `setjmp` presents an initialized `jmp_buf`, is the effect after `longjmp`. Because it is returning as if it had been invoked. – jxh May 12 '22 at 15:36
0

That's fine.

On a related note, you don't need volatile on i because it's assigned to by setjmp().

On a very careful reading of the man page for longjmp() and my copy of K&R C, the contents of jb are only invalid within the body of your function, meaning if there were a second call to longjmp(), it would see a valid view of jb. Under the reasonable assumption that valid code does not become invalid in newer standard versions, this will still apply today.

TL;DR you don't need to mark variables of type jmp_buf volatile.

Joshua
0

TL;DR Since the standard isn't clear, it is better to treat the value of a local jmp_buf as indeterminate after a local longjmp.

ISO/IEC 9899:2018 §7.13.1.1 ¶2 describes the behavior of setjmp, and ¶3 describes what happens on return.

The setjmp macro saves its calling environment in its jmp_buf argument for later use by the longjmp function.

...

If the return is from a direct invocation, the setjmp macro returns the value zero. If the return is from a call to the longjmp function, the setjmp macro returns a nonzero value.

We infer that a successful return from setjmp results in an initialized jmp_buf argument. However, there is no mention of whether the initialization takes into account that the jmp_buf itself may have automatic storage duration (and so could itself be represented by registers rather than by memory).

ISO/IEC 9899:2018 §7.13.2.1 ¶3 describes the behavior of longjmp, and is worded the same as the 2011 text cited by Marco:

All accessible objects have values, and all other components of the abstract machine[254] have state, as of the time the longjmp function was called, except that the values of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call are indeterminate.

[254] This includes, but is not limited to, the floating-point status flags and the state of open files.

However, the meaning of the word between is somewhat elusive. The standard could have explicitly specified the context of between to mean after setjmp completed. For example, the wording could have stated:

... changed between the setjmp return and longjmp call are indeterminate.

The current wording suggests that one should include the invocation of setjmp itself as something that may trigger the indeterminate condition.

It is possible, however, that the semantics of the return from longjmp cover this problem. ISO/IEC 9899:2018 §7.13.2.1 ¶4 states:

After longjmp is completed, thread execution continues as if the corresponding invocation of the setjmp macro had just returned the value specified by val. ...

This sentence could be interpreted to mean that the invocation semantics of setjmp are the same whether it returns from a direct invocation or from a call to the longjmp function. That is, the return of setjmp means the jmp_buf argument is initialized and can be used by another longjmp. But again, this is not clear. In the most limiting interpretation, the as if clause only speaks to the value returned by setjmp, and not to the invocation itself.

Since the semantics are ambiguous, it is proper to treat the jmp_buf object value as indeterminate upon return from longjmp.
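One conservative option, if you share this concern, is to take the jmp_buf out of the reach of the exception altogether. The indeterminate-value exception in ¶3 applies only to objects of automatic storage duration, so a jmp_buf with static storage duration is not covered by it. Below is a minimal sketch, assuming the function does not need to be reentrant or thread-safe. `foo_static`, the `do_stuff` stub, and the `body_runs` counter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for do_stuff(): it only counts how often it ran. */
static int body_runs = 0;
static void do_stuff(int i) { (void)i; ++body_runs; }

/* Same loop as in the question, but with the jmp_buf in static storage. */
int foo_static(int n) {
    static jmp_buf jb;   /* static storage duration: outside the ¶3 exception,
                            at the cost of reentrancy and thread safety */
    volatile int i = 0;  /* still automatic and modified, so still volatile */

    assert(n >= 0);

    (void)setjmp(jb);    /* expression statement: a permitted setjmp context */

    if (i < n) {
        do_stuff(i);
        ++i;
        longjmp(jb, 1);  /* jb is not an automatic object, so the exception
                            for changed automatic objects cannot apply to it */
    }
    return i;
}
```

The trade-off is that the jump target is no longer private to a single activation of the function, so recursive or concurrent calls would overwrite each other's saved environment.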

jxh