
I'm writing C code for an embedded system. In this system, there are memory-mapped registers at fixed addresses in the memory map, and of course some RAM where my data segment / heap lives.

I'm having trouble getting optimal code generated when my code intermixes accesses to global variables in the data segment with accesses to hardware registers. This is a simplified snippet:

#include <stdint.h>

uint32_t * const restrict HWREGS = (uint32_t *)0x20000;

struct {
    uint32_t a, b;
} Context;

void example(void) {
    Context.a = 123;
    HWREGS[0x1234] = 5;
    Context.b = Context.a;
}

This is the code generated on x86 (see also on godbolt):

example:
        mov     DWORD PTR Context[rip], 123
        mov     DWORD PTR ds:149712, 5
        mov     eax, DWORD PTR Context[rip]
        mov     DWORD PTR Context[rip+4], eax
        ret

As you can see, after the hardware register has been written, Context.a is reloaded from RAM before being stored into Context.b. This doesn't make sense because Context is at a different memory address than HWREGS. In other words, the memory pointed to by HWREGS and the memory pointed to by &Context do not alias, but it looks like there is no way to tell that to the compiler.

If I change the definition of HWREGS to this:

extern uint32_t * const restrict HWREGS;

that is, I hide the fixed memory address from the compiler, I get this:

example:
        mov     rax, QWORD PTR HWREGS[rip]
        mov     DWORD PTR [rax+18640], 5
        movabs  rax, 528280977531
        mov     QWORD PTR Context[rip], rax
        ret
Context:
        .zero   8

Now the two writes to Context are optimized (even coalesced into a single write), but on the other hand the hardware register is no longer written with a direct memory access: the store goes through a pointer indirection instead.

Is there a way to obtain optimal code here? I would like GCC to know that HWREGS is at a fixed memory address and at the same time to tell it that it does not alias Context.

Jérôme Richard
Giovanni Bajo

1 Answer


If you want to keep the compiler from reloading values from memory because of possible aliasing, the best approach is to avoid global variables, or at least to avoid accessing them directly. The restrict keyword seems to be ignored on global variables (notably on HWREGS here) by both GCC and Clang. Using the restrict keyword on function parameters solves the problem:

#include <stdint.h>

uint32_t * const HWREGS = (uint32_t *)0x20000;

struct Context {
    uint32_t a, b;
} context;

static inline void exampleWithLocals(uint32_t* restrict localRegs, struct Context* restrict localContext) {
    localContext->a = 123;
    localRegs[0x1234] = 5;
    localContext->b = localContext->a;
}

void example(void) {
    exampleWithLocals(HWREGS, &context);
}

Here is the result (see also on godbolt):

example:
        movabs  rax, 528280977531
        mov     DWORD PTR ds:149712, 5
        mov     QWORD PTR context[rip], rax
        ret
context:
        .zero   8
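
The constant in the movabs above is simply both fields packed into one 64-bit value, which is how the two stores end up as a single write. A minimal sketch of that arithmetic, assuming the little-endian layout of x86-64 and no padding between a and b (packFields is a hypothetical helper, not part of the code above):

```c
#include <stdint.h>

/* Build the 64-bit value that one little-endian store would write over
   two adjacent uint32_t fields: a lands in the low half (lower address),
   b in the high half. */
static inline uint64_t packFields(uint32_t a, uint32_t b)
{
    return (uint64_t)a | ((uint64_t)b << 32);
}
```

With a = b = 123 this gives 0x0000007B0000007B, which is 528280977531 in decimal.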

Please note that the strict aliasing rule does not help in this case, since every variable or field that is read or written has the same type, uint32_t.
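
To make the point about types concrete, here is a small sketch contrasting the two situations. The 16-bit register variant is purely hypothetical and only there for comparison; sameType and differentType are illustrative names, and index 0 is used instead of 0x1234 to keep the buffers small.

```c
#include <stdint.h>

struct Context {
    uint32_t a, b;
};

/* Same type on both sides: regs may legally point at ctx->a, so without
   restrict the compiler has to assume the store can change ctx->a and
   must reload it. */
void sameType(uint32_t *regs, struct Context *ctx)
{
    ctx->a = 123;
    regs[0] = 5;
    ctx->b = ctx->a;
}

/* Different type (hypothetical 16-bit registers): a uint16_t lvalue may
   not be used to access a uint32_t object, so the compiler is allowed to
   assume regs[0] does not overlap ctx and may reuse the value 123. */
void differentType(uint16_t *regs, struct Context *ctx)
{
    ctx->a = 123;
    regs[0] = 5;
    ctx->b = ctx->a;
}
```

Calling sameType(&ctx.a, &ctx) is well defined and leaves ctx.b equal to 5, which is exactly the case the compiler has to preserve when it reloads Context.a in the original code.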

Besides this, judging by its name, HWREGS points to hardware registers. Please note that the pointed-to data should be declared volatile (volatile uint32_t * const HWREGS) so that the compiler does not cache the register values in CPU registers or perform similar optimizations (such as assuming that the pointed-to value stays unchanged as long as the code does not modify it).
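
A minimal sketch of how the volatile qualification could be combined with the restrict-qualified parameters from above. The function name exampleWithVolatileRegs is hypothetical, and the pointer is passed as a parameter so the code can also be run against an ordinary array on a hosted machine; whether the compiler still merges the two Context stores in this form is something to check in the generated assembly.

```c
#include <stdint.h>

struct Context {
    uint32_t a, b;
};

/* On the real target the register block would be declared with a
   volatile-qualified pointee, for example (address from the question):
       static volatile uint32_t * const HWREGS = (volatile uint32_t *)0x20000;
*/
static inline void exampleWithVolatileRegs(volatile uint32_t * restrict regs,
                                           struct Context * restrict ctx)
{
    ctx->a = 123;
    regs[0x1234] = 5;   /* volatile store: the compiler must emit it */
    ctx->b = ctx->a;    /* restrict allows the value 123 to be reused */
}
```

On the target, the call would be exampleWithVolatileRegs(HWREGS, &context).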

Jérôme Richard
  • Thanks! Any reason why it can't be made to work with global variables? Is it just because there's no way to specify "restrict" outside of the context of a function? – Giovanni Bajo May 14 '21 at 22:09
  • I suspect this is a limitation of the compilers and not of the C standard, but I am not sure. You can find the formal definition of the `restrict` keyword in section 6.7.3.1 of the [C99 standard draft](http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf). I find this section pretty hard to understand, but there is an example with global variables. Note particularly paragraph 6, stating that *a translator is free to ignore any or all aliasing implications of uses of `restrict`*. In practice, global variables cause issues in many optimization steps of current mainstream compilers. – Jérôme Richard May 14 '21 at 22:57
  • @JérômeRichard: The Standard allows the `restrict` qualifier on things other than automatic objects, but I don't think the authors gave much thought to corner cases. The semantics of restrict-qualified pointers are tightly coupled to the lifetime of objects containing them, since accesses made via restrict-qualified pointers are unsequenced with regard to accesses made by other means *during the lifetime of the restrict-qualified pointer objects*, but they become nebulous for objects whose lifetime never ends. – supercat May 18 '21 at 19:27