
I'm working with JIT compilation using TCC (Tiny C Compiler), but it has limited support for assembly and I frequently get stuck on this. Is there some kind of trick to insert raw instructions into inline assembly? For example:

mov -0x18(%rbp), %rax
finit
flds (%rax)

/* Custom unsupported binary instructions here */

flds (%rcx)

I know this won't be easy to maintain, but I would like to keep TCC unmodified.

Peter Cordes
Wadosh

2 Answers


If it supports standard GAS / Unix-assembler directives like .byte 0x00, 0x12, you can emit any byte sequence you want. (Also .word or .long if you want to write a 16-bit or 32-bit value as a single number; with GAS on x86, .word emits 16 bits and .long emits 32 bits.)
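As a rough sketch of what that looks like from C, here is a hand-encoded x86 `mov $0x12345678, %eax` written as one opcode byte followed by a 32-bit immediate. The function name `load_const_raw` is made up for illustration, and the block assumes a compiler that accepts GNU C Extended asm with an `"=a"` output constraint (gcc and clang do; check what your TCC build accepts). On non-x86 targets it falls back to plain C so the file still builds.

```c
#include <stdint.h>

/* Hypothetical helper: returns 0x12345678, loaded by a hand-encoded
 * `mov $0x12345678, %eax` instead of an assembler mnemonic. */
uint32_t load_const_raw(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t result;
    __asm__ volatile (
        ".byte 0xb8\n\t"       /* opcode B8: mov imm32 into %eax */
        ".long 0x12345678"     /* the imm32, emitted little-endian */
        : "=a" (result));      /* tell the compiler the result is in %eax */
    return result;
#else
    /* Assumption: other targets get a plain C fallback, no raw bytes. */
    return 0x12345678u;
#endif
}
```

The `.long` line is equivalent to `.byte 0x78, 0x56, 0x34, 0x12`; writing it as a single number just saves doing the byte swap by hand.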

GNU as manual

Peter Cordes

GNU as supports two nice ways to work with raw instructions. There are .byte, .short, .long and similar directives that can be used to directly emit the raw numbers in the output.

Another way (only implemented for a few architectures) is the .insn directive that allows for a middle way between ordinary assembly instructions and the raw numbers mentioned above. See for example the documentation for RISC-V .insn formats.

To use either of them within C you can use the (non-standard) asm statement (or variations of it). The following example uses GCC's naked function attribute (also only available on some architectures) to prevent any additional instructions from being emitted by the compiler (e.g., function prologue/epilogue), which can easily trash registers.

This example is for RISC-V and shows standard 32-bit instructions as well as compressed 16-bit instructions:

void __attribute__ ((naked)) rawinst (void) {
  __asm__ volatile ("li t5,1");
  __asm__ volatile (".short 0x4f05"); // li t5,1
  __asm__ volatile ("lui t5, 0x1c000");
  __asm__ volatile (".word 0x1c000f37"); // lui t5, 0x1c000
  __asm__ volatile (".insn r 0x33, 0, 0, a0, a1, a2"); // add a0, a1, a2
  __asm__ volatile ("ret");
}

If you want this inside a non-naked function, use GNU C Extended asm with constraints / clobbers to tell the compiler what your asm does and where the inputs/outputs are, and of course don't use your own ret. That applies at least to gcc/clang; if you're actually using TCC, it doesn't optimize at all between C statements, so it should be safe to use Basic asm statements.
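A minimal sketch of the non-naked case, using x86 rather than RISC-V so it can be tried on a typical desktop: the raw bytes `01 f0` encode `add %esi, %eax`, and the constraints pin the operands to exactly those registers. The name `add_raw` is hypothetical, and the block assumes gcc/clang-style Extended asm (TCC's constraint support may differ). Non-x86 targets take a plain C fallback.

```c
#include <stdint.h>

/* Hypothetical helper: adds b to a with a hand-encoded instruction. */
uint32_t add_raw(uint32_t a, uint32_t b)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ (
        ".byte 0x01, 0xf0"     /* 01 /r with ModRM F0: add %esi, %eax */
        : "+a" (a)             /* %eax is read and written */
        : "S" (b)              /* %esi is an input only */
        : "cc");               /* add modifies the flags */
#else
    /* Assumption: other targets get a plain C fallback, no raw bytes. */
    a += b;
#endif
    return a;
}
```

Because the bytes hard-code the registers, the constraints have to name specific registers (`a`, `S`) instead of letting the compiler pick with `r`. There is also no `ret` in the asm; the compiler emits its own function epilogue.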

Peter Cordes
stefanct
  • `__attribute__((naked))` *is* necessary to use GNU C Basic asm statements like this, except for instructions that don't modify registers or memory. And if using such instructions inside a non-naked function, you probably want GNU C Extended asm anyway to specify a `"memory"` clobber or not to control ordering wrt. memory access by surrounding C statements. (Like if you were disabling interrupts or doing a memory barrier instruction). – Peter Cordes Jan 24 '22 at 20:58
  • If you modify registers outside of a `naked` function, you *must* tell the compiler about it. Otherwise your code can totally break when this function inlines into a larger caller. (And in that case you must not use instructions like `ret` yourself.) See https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended re: why it's terrible for anything except `naked` functions. – Peter Cordes Jan 24 '22 at 21:00
  • I have briefly mentioned these problems in my original answer (similar problems apply to yours btw), however, not all instructions work on registers or memory. I am working on a design with custom instructions that interact with an external module where these problems are nil. So it very much depends on the concrete application. I don't think anybody will stumble over this question who does not know what they are doing but if so, this is an excellent opportunity to shoot oneself in the foot for educational purposes ;) – stefanct Jan 24 '22 at 22:28
  • *working on a design with custom instructions that interact with an external module where these problems are nil* - Makes sense. But that's not the case for the example you included in your answer, which does write registers, so your answer needed fixing. There are *many* inline asm questions on SO that really are so naive that they think `asm("stuff")` should Just Work in a normal function, like MSVC style asm{} blocks do. Given the question title, I could easily imagine naive future readers landing here, not knowing how GNU C inline asm works. – Peter Cordes Jan 24 '22 at 22:39
  • In my answer, I didn't mention it, but I was *just* talking about GAS syntax, not how to use inline asm. Also, the question is tagged [TCC], which I wasn't sure even supported GNU C extended asm, so I just stuck to the GAS details. But given the question title and tags, it's likely it will attract some attention from other readers (since TCC is so nice), so I'm glad you posted this answer where we can talk about GNU C extensions. – Peter Cordes Jan 24 '22 at 22:43