GNU as supports two nice ways to work with raw instructions. There are .byte, .short, .long and similar directives that can be used to directly emit the raw numbers in the output.
Another way (only implemented for a few architectures) is the .insn directive, which offers a middle way between ordinary assembly instructions and the raw numbers mentioned above. See for example the documentation for RISC-V .insn formats.
To use either of them within C you can use the (non-standard) asm statement (or variations of it). The following example uses GCC's naked function attribute (also only available on some architectures) to prevent the compiler from emitting any additional instructions (e.g., a function prologue/epilogue), which can easily trash registers.
This example is for RISC-V and shows standard 32-bit instructions as well as compressed 16-bit instructions:
void __attribute__ ((naked)) rawinst (void) {
    __asm__ volatile ("li t5,1");
    __asm__ volatile (".short 0x4f05"); // li t5,1
    __asm__ volatile ("lui t5, 0x1c000");
    __asm__ volatile (".word 0x1c000f37"); // lui t5, 0x1c000
    __asm__ volatile (".insn r 0x33, 0, 0, a0, a1, a2"); // add a0, a1, a2
    __asm__ volatile ("ret");
}
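To double-check where the raw numbers above come from, here is a minimal sketch that packs the fields of the three instruction formats involved (U-type, R-type, and the compressed CI-type used by c.li) by hand. The helper names encode_u, encode_r and encode_c_li are made up for illustration and are not part of any toolchain; the register numbers follow the standard RISC-V ABI names (t5 = x30, a0 = x10, a1 = x11, a2 = x12).

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* U-type (e.g. lui): imm[31:12] | rd | opcode */
static uint32_t encode_u(uint32_t opcode, uint32_t rd, uint32_t imm20)
{
    return ((imm20 & 0xFFFFFu) << 12) | ((rd & 0x1Fu) << 7) | (opcode & 0x7Fu);
}

/* R-type (e.g. add): funct7 | rs2 | rs1 | funct3 | rd | opcode.
 * The argument order mirrors ".insn r opcode, funct3, funct7, rd, rs1, rs2". */
static uint32_t encode_r(uint32_t opcode, uint32_t funct3, uint32_t funct7,
                         uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return ((funct7 & 0x7Fu) << 25) | ((rs2 & 0x1Fu) << 20) |
           ((rs1 & 0x1Fu) << 15) | ((funct3 & 0x7u) << 12) |
           ((rd & 0x1Fu) << 7) | (opcode & 0x7Fu);
}

/* Compressed CI-type c.li: funct3=010 | imm[5] | rd | imm[4:0] | op=01 */
static uint16_t encode_c_li(uint32_t rd, int32_t imm)
{
    uint32_t u = (uint32_t)imm & 0x3Fu;   /* 6-bit signed immediate */
    return (uint16_t)((2u << 13) | ((u >> 5) << 12) |
                      ((rd & 0x1Fu) << 7) | ((u & 0x1Fu) << 2) | 1u);
}
```

With these, encode_u(0x37, 30, 0x1c000) gives 0x1c000f37 and encode_c_li(30, 1) gives 0x4f05, matching the .word and .short operands in the example, while encode_r(0x33, 0, 0, 10, 11, 12) gives 0x00c58533, the word that the .insn r line assembles to.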
If you want this inside a non-naked function, use GNU C Extended asm with constraints / clobbers to tell the compiler what your asm does and where the inputs/outputs are. And of course don't use your own ret. That applies at least to gcc/clang; if you're actually using TCC, it doesn't optimize at all between C statements, so it should be safe to use Basic asm statements.
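As a rough sketch of the Extended asm variant, assuming a RISC-V target with gcc or clang: the function below emits the same R-type add through .insn, but lets the compiler choose the registers via "r" constraints instead of hard-coding a0/a1/a2. The name raw_add is hypothetical, and the #else branch is only there so the sketch still builds on other architectures.

```c
/* Hypothetical helper: adds two values using a raw R-type encoding of add. */
static inline long raw_add(long a, long b)
{
    long sum;
#if defined(__riscv)
    /* %0, %1, %2 are replaced by whichever registers the compiler picks
     * for sum, a and b; "=r" marks an output, "r" marks inputs. */
    __asm__ (".insn r 0x33, 0, 0, %0, %1, %2"
             : "=r" (sum)
             : "r" (a), "r" (b));
#else
    sum = a + b;  /* portable fallback, not a raw instruction */
#endif
    return sum;
}
```

Because the asm has an output operand and no side effects, volatile is not needed here, and there is no ret: the compiler generates the function's own prologue and epilogue around the instruction.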