Suppose I have some inline assembly that needs a particular char value in ah, bh, ch, or dh. How can I tell GCC to put it there? I don't see a relevant constraint to do that, but the GCC manual says "If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, local register variables may provide a solution", so I tried that:

void f(char x) {
    register char y __asm__("ah") = x;
    __asm__ __volatile__(
        "# my value ended up in %0" :: "a"(y)
    );
}

But it didn't work. It put it in al instead:

        movb    4(%esp), %al
        # my value ended up in %al

The x86-specific Q constraint also looks close to what I want, so I tried it in place of a, but it had the same result. I also tried with the more-generic r.

Interestingly, when I compile with Clang instead of GCC (whether with a, Q, or r), then I do get the desired result:

        movb    4(%esp), %ah
        # my value ended up in %ah

I also tried with bh, ch, and dh in place of ah, and every combination of them led to analogous results.

I also tried compiling as 64-bit instead of 32-bit. There, GCC still does basically the same wrong thing:

        movl    %edi, %eax
        # my value ended up in %al

And Clang failed to compile at all, with the error "Cannot encode high byte register in REX-prefixed instruction", unless I turned off optimizations (which I opened LLVM bug #45865 about), in which case it did eventually get the value in the right place:

        movb    %dil, -1(%rsp)
        movb    -1(%rsp), %al
        movb    %al, -2(%rsp)
        movb    -2(%rsp), %ah
        # my value ended up in %ah

Is this a bug in GCC that I should report, or is this something that's not supposed to work and is only working by chance in Clang? If the latter, is there a way to do what I want, or will I have to settle for moving it there from somewhere else myself from within the assembly?

32-bit Godbolt link. 64-bit Godbolt link.

  • Shouldn't it be `"r"(y)`? – fuz May 10 '20 at 00:16
  • Anyway; I had the same problem on ia16-gcc. Never managed to find a way. – fuz May 10 '20 at 00:17
  • @fuz Maybe, but that doesn't change anything (still works on Clang but not on GCC). – Joseph Sible-Reinstate Monica May 10 '20 at 00:19
  • What's worse is if you have both `ah` and `al` ... gcc silently ignores one of them. – Jester May 10 '20 at 00:23
  • If you ask for `(int)y<<8` in EAX, you can use `%h0` and `%b0` in the template to expand to AH and AL. (Or BH and BL if GCC picked EBX). https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86-Operand-Modifiers. A `"Q"` constraint makes sure the `%h0` modifier can work by picking one of E[ABCD]X. That's why the Q constraint exists, not to get GCC to actually *pick* AH on its own. I'd have hoped it would be compatible with a `register char foo asm("ah");` but that is a separate thing. – Peter Cordes May 10 '20 at 00:51
  • re: clang: with optimization enabled I assume it was trying to emit `mov %dil, %ah`, which is not encodable. (DIL needs a REX; AH is only encodable without a REX). Any ICE could be considered a bug, but the bugfix might just be a better warning when a `register asm` variable provokes it, or making clang work like GCC and picking AL instead of AH! Or possibly emitting `rorx $24, %edx, %eax` to get DIL into AH in one BMI2 instruction while avoiding the REX problem. – Peter Cordes May 10 '20 at 01:09
  • Have you tried padding the 8-bit value into a 32-bit value then putting it into "a"? – Boann May 10 '20 at 14:16
  • This is totally off in another direction, but I'm curious what was compelling you to put something in AH. I can think of a use case, but that is related to 16-bit real mode. Although, because of this bug/deficiency, you can't yet load AH directly with a constraint on GCC, you can always move something into AH inside the inline assembly. Or simply use EAX and store `(int)y<<8` as Peter mentioned. What you are asking is doable, but not as directly (efficiently) as you'd like. The place where I have had to deal with this issue is in using inline assembly to make BIOS calls. – Michael Petch May 10 '20 at 17:22
  • The `"a"` constraint specifies al/ax/eax/rax (depending on size), so it's not surprising it puts it there. Try the `"r"` constraint to allow a different register. – Chris Dodd May 11 '20 at 07:31
  • @ChrisDodd "I also tried with the more-generic `r`." – Joseph Sible-Reinstate Monica May 11 '20 at 13:45
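
The suggestion in the comments above (widen the value, shift it left by 8, and let the `%h` operand modifier name the high-byte register) could look roughly like the sketch below. It is only an illustration under some assumptions: the helper name `read_high_byte` is made up, the asm just copies the high-byte register back out so the result can be checked, and the non-x86 branch is a plain C fallback so the snippet still compiles elsewhere.

```c
/* Hypothetical helper: returns the byte that the asm template saw in the
   high-byte register (ah, bh, ch, or dh) chosen by the compiler. */
static unsigned char read_high_byte(unsigned char x)
{
#if defined(__i386__) || defined(__x86_64__)
    /* Place x in bits 8..15, which is where AH/BH/CH/DH live. */
    unsigned int wide = (unsigned int)x << 8;
    unsigned char seen;
    __asm__ (
        /* %h1 expands to the high-byte name of the register holding `wide`;
           %0 is a byte operand, so it prints as al/bl/cl/dl. */
        "movb %h1, %0"
        : "=Q"(seen)   /* "Q": one of a, b, c, d, so no REX prefix is needed */
        : "Q"(wide)
    );
    return seen;
#else
    /* Assumption: fallback for non-x86 targets, no inline asm involved. */
    return x;
#endif
}
```

The compiler still believes the operand is an ordinary 32-bit value, so it is the shift in C, not the constraint, that actually gets the byte into the high-byte position. Replacing `"Q"(wide)` with `"a"(wide)` should pin the choice to EAX, so that `%h1` expands to `%ah`.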

1 Answer

Apparently, constraints don't allow selecting the nested registers, but you can add an h modifier to instruction references. This is mentioned in the docs on Input Operands. For example,

void f(char x) {
    char a;
    __asm__ __volatile__(
        "mov  %0, %h1" :: "X"(x), "a"(a)
    );
}

produces

f:
        xorl    %eax, %eax
        mov  4(%esp), %ah
        ret

I haven't been able to get rid of the xor that clears eax. My guess is the code generator is interpreting "%h1" as a 32-bit word with 8 bits set, not a character register reference. For example, this:

char f(char x) {
    char a;
    __asm__ __volatile__(
        "movb  %0, %h1" :: "X"(x), "a"(a)
    );
    return a;
}

... compiles to the same code and returns \0, which is not very intuitive.

Gene
  • Your second C snippet looks like straight-up Undefined Behavior, since it's returning `a` without having ever set it. – Joseph Sible-Reinstate Monica May 11 '20 at 04:27
  • Your examples don't quite make sense because you have got both registers as input operands. But anyway, the register modifier causes the generated assembly to refer to `%ah`, but the compiler still doesn't know that, and will try to load the input value into `%al` instead. You can't see that in your example because you didn't initialize `a`. But if you do initialize it, as in [this example on godbolt](https://godbolt.org/), you see that the resulting code is not what you want. – Nate Eldredge May 11 '20 at 04:55
  • In other words, I think `%h1` is completely opaque to the code generator. The compiler generates the surrounding code as if you had just written plain `%1`, and then just changes the name of the register in your assembly. So this really doesn't solve the problem. – Nate Eldredge May 11 '20 at 04:58
  • I think the register modifier feature is meant for a situation when you have a value loaded into a 32-bit register but your assembly only wants to operate on part of it. For instance `unsigned a=..., x; asm("movb $0x5a, %h0" : "=Q" (x) : "0" (a))` would be a valid way to achieve `x = a & 0xffff00ff | 0x5a00` in one instruction. – Nate Eldredge May 11 '20 at 05:12
  • @NateEldredge The mov instruction sets `a`. – Gene May 12 '20 at 05:36
  • @Gene No it doesn't. `a` is only marked as an input by your constraints, not an output, so GCC doesn't care what change you make to its register in the assembly. – Joseph Sible-Reinstate Monica May 14 '20 at 01:48
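
For reference, the one-instruction example from the comments can be wrapped into a small function like the sketch below. The function name `set_second_byte` is hypothetical, and the non-x86 branch is an assumed plain C fallback that computes the same result. The key point from the last few comments is that the register appears as an output (`"=Q"`) tied to the input via `"0"`, so the compiler knows the asm modifies it.

```c
/* Hypothetical wrapper: overwrite bits 8..15 of `a` with 0x5a. */
static unsigned int set_second_byte(unsigned int a)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int x;
    __asm__ (
        /* %h0 names the high-byte register (ah/bh/ch/dh) of operand 0. */
        "movb $0x5a, %h0"
        : "=Q"(x)   /* output: one of eax, ebx, ecx, edx */
        : "0"(a)    /* input: must start out in the same register as operand 0 */
    );
    return x;
#else
    /* Assumption: fallback for non-x86 targets, same result in plain C. */
    return (a & 0xffff00ffu) | 0x5a00u;
#endif
}
```

Because `x` is declared as an output, returning it is well defined, unlike returning a variable that was only listed as an input.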