
I tried to set an external variable and get its value afterwards, but the value I got was not correct. Do external variables always need to be volatile when compiled with gcc?

The code is as follows (updated with the complete code; the earlier write to memory address 0x00100000 has been replaced by an assignment to the variable another):

main.c

extern unsigned int ex_var;
extern unsigned int another;

int main ()
{
    ex_var = 56326;

    unsigned int *pos=&ex_var+16;

    for (int i = 0; i < 6; i++ )
    {
        *pos++ = 1;
    }
    another = ex_var;
}

another.c

unsigned int ex_var;  // the address of this variable is set to right before valid_address
unsigned int valid_address[1024];  // to indicate valid memory address
unsigned int another;

The value stored in another is not 56326.

Note: another.c may look odd; it exists only to indicate that the memory region after ex_var is valid. For the actual example running on bare metal, please refer to this post.

Here is the disassembly of main.c, compiled with x86-64 gcc 12.2 using the -O1 option:

main:
        mov     eax, OFFSET FLAT:ex_var+64
.L2:
        add     rax, 4
        mov     DWORD PTR [rax-4], 1
        cmp     rax, OFFSET FLAT:ex_var+88
        jne     .L2
        mov     eax, DWORD PTR ex_var[rip]
        mov     DWORD PTR another[rip], eax
        mov     eax, 0
        ret

The store that sets the external variable ex_var has been optimized out.

I tried several gcc targets, including x86-64 gcc, x86 gcc, arm64 gcc, and arm gcc, and all tested gcc versions above 8.x appear to have this issue. Note that optimization level -O1 or above is needed to reproduce it.

The code can be found at this link at Compiler Explorer.

Update:

The problem in the above code is not related to external references.

Here is another example that shows the same behavior. Note that it should be compiled with -O1 or above. You can try it at Compiler Explorer.

#include <stdio.h>
unsigned int var;
// In an embedded environment, a linker script can place valid_address right after var
volatile unsigned int valid_address[1024];
int main ()
{
    var = 56326;
    unsigned int *ttb=&var;
    ttb += 16;
    for (int i = 0; i < 8; i++ )
    {
        *ttb++ = 1;
    }
    valid_address[0] = var;
    printf("Value is: %d", valid_address[0]);
}

If you compile this code with gcc like

gcc -O1 main1.c

and execute this code, you might get the following output:

Value is: 0

Which is not correct.

Feihu Liu
  • You have undefined behavior in `unsigned int *pos=&ex_var+16;` so this code can legally do anything... including wipe your hard drive. So the compiler optimizing out the write is perfectly legal. – Mgetz Sep 06 '22 at 15:36
  • Note: once you get rid of the undefined behavior in that line, [it stops optimizing the write out](https://godbolt.org/z/8YGea8Y7z). There is still, legally, other undefined behavior here, but it's `volatile` so the compiler will allow that. – Mgetz Sep 06 '22 at 15:44
  • Removing the code causing undefined behavior (or [replacing it with some valid code](https://godbolt.org/z/91K8Ms51d)) magically restores the write into `ex_var`. – Eugene Sh. Sep 06 '22 at 15:47
  • What makes you think that `0x00100000` is an address you can legitimately write to? What makes you think you can write 16 or more integers beyond your simple scalar global variable? Both of those are undefined behaviour; anything can happen. Don't do it! (The succinct answer to the question's title is "No": you don't have to make global variables volatile as long as you don't invoke undefined behaviour, and if you do invoke undefined behaviour, making the variables volatile is not guaranteed to fix anything.) – Jonathan Leffler Sep 06 '22 at 15:49
  • @JonathanLeffler I believe the write to `0x00100000` in this case is just for demonstration purposes; it is not running on any real system but just used to produce the assembly. On a real system it might or might not be a valid address to access. – Eugene Sh. Sep 06 '22 at 15:51
  • @EugeneSh., if you say so, but IMO it obfuscates the code. – Jonathan Leffler Sep 06 '22 at 15:52
  • The code here is actually used in a bare-metal ARM environment, and both the address after `&ex_var` and the address 0x00100000 are valid. But I think the above comments are correct: it confused the compiler. – Feihu Liu Sep 06 '22 at 16:08
  • Q: Do external variables always need to be volatile when compiled with gcc? A: No, absolutely not. The problem is that `unsigned int *pos=&ex_var+16;` is [undefined behavior](https://www.geeksforgeeks.org/undefined-behavior-c-cpp/), so your compiler happened to optimize it away. As you've discovered, a workaround is [volatile](https://www.geeksforgeeks.org/understanding-volatile-qualifier-in-c/). – paulsm4 Sep 06 '22 at 17:05
  • @paulsm4 Referring to my Update: if the address of the variable is first obtained with `unsigned int *ttb=&var;` and pointer arithmetic such as `ttb += 16;` is done afterwards, shouldn't there be no undefined behavior here? But the result is still incorrect. – Feihu Liu Sep 07 '22 at 02:39

1 Answer


The calculation &ex_var+16 is not defined by the C standard (because it only defines pointer arithmetic within an object, including to the address just beyond its end) and the assignment *pos++ = 1 is not defined by the C standard (because, for the purposes of the standard, pos does not point to an object). When there is behavior not defined by the C standard on a code path, the standard does not define any behavior on the code path.
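As a rough illustration of where the standard's line falls, the sketch below contrasts the two pointer computations that are defined: forming the address just beyond a scalar, and stepping through elements that really belong to an array. The function names span_of_scalar and fill_ones are made up for this example and do not come from the question. Splitting `&var + 16` into `ttb = &var; ttb += 16;` does not change anything, because the result still lies outside the object either way.

```c
#include <stddef.h>

/* Returns how many elements separate the start of a scalar from the
   address just beyond it. For pointer arithmetic a scalar behaves like
   an array of one element, so &scalar + 1 may be computed and compared,
   but it must not be dereferenced. (Hypothetical helper.) */
ptrdiff_t span_of_scalar(void)
{
    unsigned int scalar = 56326;
    unsigned int *one_past = &scalar + 1;   /* defined: one past the end */
    return one_past - &scalar;              /* defined: yields 1 */
}

/* Writes 1 to count elements starting at index first, but only when the
   whole range lies inside the array of len elements. Returns 1 on
   success and 0 when the range would leave the array.
   (Hypothetical helper.) */
int fill_ones(unsigned int *array, size_t len, size_t first, size_t count)
{
    if (first > len || count > len - first) {
        return 0;                           /* range is not inside the object */
    }
    unsigned int *pos = array + first;      /* defined: stays within the array */
    for (size_t i = 0; i < count; i++) {
        *pos++ = 1;
    }
    return 1;
}
```

With a 32-element array, fill_ones(block, 32, 16, 6) performs the same six stores as the loop in the question, but every address it touches belongs to the object the pointer was derived from.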

You can make the behavior defined, to the extent the compiler can see, by declaring ex_var as an array of unknown size, so that the address calculation and the assignments would be defined if this translation unit were linked with another that defined ex_var to be an array of sufficient size:

extern unsigned int ex_var[];

int main ()
{
    ex_var[0] = 56326;

    unsigned int *pos = ex_var+16;

    for (int i = 0; i < 6; i++ )
    {
        *pos++ = 1;
    }
    *(volatile unsigned int*)(0x00100000) = ex_var[0];
}

(Note that *(volatile unsigned int*)(0x00100000) = remains not defined by the C standard, but GCC is intended for some use in bare-metal environments and appears to work with this. Additional compilation switches might be necessary to ensure it is defined for GCC’s purposes.)

This yields assembly that sets ex_var[0] and uses it in the assignment to 0x00100000:

main:
        mov     DWORD PTR ex_var[rip], 56326
…
        mov     eax, DWORD PTR ex_var[rip]
        mov     DWORD PTR ds:1048576, eax
        mov     eax, 0
        ret
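For a variant that can be compiled and run on an ordinary hosted system, here is a minimal single-file sketch. It makes several assumptions that are not in the code above: the definition that would live in another.c is placed in the same file, the array size 1 + 1024 is only illustrative, the function name run_example is hypothetical, and the store to 0x00100000 is replaced by the question's variable another so that nothing writes to a fixed address.

```c
/* Stand-in for another.c (assumed): ex_var is defined as one array whose
   element 0 plays the role of the old scalar and whose remaining elements
   cover the region the loop writes to. The size 1 + 1024 is illustrative. */
unsigned int ex_var[1 + 1024];
unsigned int another;

/* Stand-in for main() in main.c; the name run_example is hypothetical. */
unsigned int run_example(void)
{
    ex_var[0] = 56326;

    unsigned int *pos = ex_var + 16;    /* inside the array, so defined */

    for (int i = 0; i < 6; i++)
    {
        *pos++ = 1;
    }
    another = ex_var[0];                /* replaces the write to 0x00100000 */
    return another;
}
```

Because every access now stays inside a single object, the compiler has no license to drop the store to ex_var[0], and calling run_example() is expected to leave 56326 in another at any optimization level.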
Eric Postpischil