12

I want to trigger an ARM Cortex-M3 Undefined Instruction exception in order to test my test fixture. The IAR compiler supports this with inline assembly like this:

asm("udf.w #0");

Unfortunately, the GCC inline assembler does not recognize this opcode for the NXP LPC177x8x. It reports the diagnostic:

ccw3kZ46.s:404: Error: bad instruction `udf.w #0'

How can I create a function that causes an Undefined Instruction exception?

artless noise
harper
  • According to http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0337e/I1010015.html blx from thumb16 is not supported by M3 and "always faults". – Aki Suihkonen Apr 18 '13 at 11:59

4 Answers

14

Building on Masta79's answer:

There is a "permanently undefined" encoding listed in the ARMv7-M architecture reference manual - ARM DDI 0403D (documentation placeholder, registration required). The encoding is 0xf7fXaXXX (where 'X' is ignored). Of course instruction fetches are little-endian, so (without testing):

asm volatile (".word 0xf7f0a000\n");

should yield a guaranteed undefined instruction on any ARMv7-M or later processor.
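The byte-order question raised in the comments on this answer can be checked on paper. The sketch below is plain host-side C, not target code. It assumes a little-endian target, and the helper names `emit_word_le` and `emit_thumb32` are made up for illustration. It contrasts the bytes a `.word` directive stores with the layout of a 32-bit Thumb instruction, which sits in memory as two little-endian halfwords with the leading halfword (`0xf7f0` here) at the lower address.

```c
#include <stdint.h>

/* Bytes stored by `.word value` on a little-endian target. */
static void emit_word_le(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value & 0xffu);
    out[1] = (uint8_t)((value >> 8) & 0xffu);
    out[2] = (uint8_t)((value >> 16) & 0xffu);
    out[3] = (uint8_t)((value >> 24) & 0xffu);
}

/* Bytes of a 32-bit Thumb instruction written as hw1:hw2:
 * the first halfword goes to the lower address, and each
 * halfword is itself stored little-endian. */
static void emit_thumb32(uint32_t insn, uint8_t out[4])
{
    uint16_t hw1 = (uint16_t)(insn >> 16);
    uint16_t hw2 = (uint16_t)(insn & 0xffffu);

    out[0] = (uint8_t)(hw1 & 0xffu);
    out[1] = (uint8_t)(hw1 >> 8);
    out[2] = (uint8_t)(hw2 & 0xffu);
    out[3] = (uint8_t)(hw2 >> 8);
}
```

For `0xf7f0a000`, `emit_word_le` gives `00 a0 f0 f7` while `emit_thumb32` gives `f0 f7 00 a0`. If that reasoning carries over to your toolchain, a Thumb-state core would fetch `0xa000` first from the `.word` form, so the `.inst` directive mentioned in the comments, in its `.inst.w 0xf7f0a000` form, may be the safer spelling because it leaves the halfword ordering to the assembler. None of this has been tested on hardware.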

unixsmurf
  • Cortex-M3 has only Thumb mode. It is however Thumb with Thumb-2 extensions, meaning there are both 16-bit and 32-bit instructions. The "permanently undefined" one is a 32-bit one. – unixsmurf Apr 18 '13 at 19:56
  • I don't think you should manually change the endianness here. The assembler will automatically swap the byte order as appropriate. – tangrs Apr 18 '13 at 22:26
  • @tangrs: yes, the assembler will automatically byte-reverse the word, but 32-bit Thumb instructions are made up of 16-bit entities, which also need to be in little-endian order. – unixsmurf Apr 19 '13 at 07:00
  • When you use `.word`, that word is also written as little endian, so you shouldn't reverse it. – auselen Dec 05 '13 at 13:30
  • auselen is correct here, `a000f7f0 strdge pc, [r0], -r0`, `f7f0a000 undefined instruction 0xf7f0a000`. Please fix your answer to ".word 0xf7f0a000\n". – domen Sep 04 '14 at 12:15
  • Using `.inst` looks even better. https://sourceware.org/binutils/docs-2.25/as/ARM-Directives.html#ARM-Directives – auselen Jan 13 '15 at 11:06
  • Sorry guys, now fixed. Nudged on by @Patater. – unixsmurf Jun 30 '16 at 19:25
  • It works in thumb mode. Is there any way to trap it for a user space program? – Zibri Mar 26 '18 at 15:13
9

Some extra info...

One of GCC's builtins is

void __builtin_trap (void)

This function causes the program to exit abnormally. GCC implements this function by using a target-dependent mechanism (such as intentionally executing an illegal instruction) or by calling abort. The mechanism used may vary from release to release so you should not rely on any particular implementation.

Its implementation for ARMv7 is:

(define_insn "trap"
  [(trap_if (const_int 1) (const_int 0))]
  ""
  "*
  if (TARGET_ARM)
    return \".inst\\t0xe7f000f0\";
  else
    return \".inst\\t0xdeff\";
  "
  [(set (attr "length")
    (if_then_else (eq_attr "is_thumb" "yes")
              (const_int 2)
              (const_int 4)))
   (set_attr "type" "trap")
   (set_attr "conds" "unconditional")]
)

So for ARM mode gcc will generate 0xe7f000f0 (f0 00 f0 e7) and for Thumb mode 0xdeff (ff de), which comes in handy when disassembling or debugging.

Also note that:

these encodings match the UDF instruction that is defined in the most
recent edition of the ARM architecture reference manual.

Thumb: 0xde00 | imm8  (we chose 0xff for the imm8)
ARM: 0xe7f000f0 | (imm12 << 8) | imm4  (we chose to use 0 for both imms)
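To make the two formulas above concrete, here is a small host-side sketch that composes the encodings from an immediate. The helper names `udf_thumb16` and `udf_arm` are invented for illustration. The sketch assumes that the ARM form splits a 16-bit immediate into `imm12` (upper twelve bits) and `imm4` (lower four bits), which is how the architecture reference manual describes UDF.

```c
#include <stdint.h>

/* Thumb (16-bit) UDF: 0xde00 | imm8 */
static uint16_t udf_thumb16(uint8_t imm8)
{
    return (uint16_t)(0xde00u | imm8);
}

/* ARM (32-bit) UDF: 0xe7f000f0 | (imm12 << 8) | imm4,
 * where imm16 = imm12:imm4. */
static uint32_t udf_arm(uint16_t imm16)
{
    uint32_t imm12 = (uint32_t)(imm16 >> 4);
    uint32_t imm4 = (uint32_t)(imm16 & 0xfu);

    return 0xe7f000f0u | (imm12 << 8) | imm4;
}
```

With these helpers, `udf_thumb16(0xff)` is `0xdeff` and `udf_arm(0)` is `0xe7f000f0`, the two values GCC emits.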

For LLVM, the values generated for __builtin_trap are 0xe7ffdefe (ARM) and 0xdefe (Thumb):

case ARM::TRAP: {
  // Non-Darwin binutils don't yet support the "trap" mnemonic.
  // FIXME: Remove this special case when they do.
  if (!Subtarget->isTargetDarwin()) {
    //.long 0xe7ffdefe @ trap
    uint32_t Val = 0xe7ffdefeUL;
    OutStreamer.AddComment("trap");
    OutStreamer.EmitIntValue(Val, 4);
    return;
  }
  break;
}
case ARM::tTRAP: {
  // Non-Darwin binutils don't yet support the "trap" mnemonic.
  // FIXME: Remove this special case when they do.
  if (!Subtarget->isTargetDarwin()) {
    //.short 57086 @ trap
    uint16_t Val = 0xdefe;
    OutStreamer.AddComment("trap");
    OutStreamer.EmitIntValue(Val, 2);
    return;
  }
  break;
}
auselen
  • Beware that `__builtin_trap()` is treated as a `noreturn` function, so if you're planning to use it like a breakpoint and resume execution at the instruction after, that's not going to work. The compiler won't have generated instructions for later C statements on that path of execution. But yes, for other use cases it's exactly what you want. – Peter Cordes May 21 '19 at 10:45
9

Several officially undefined instruction encodings are documented in [1] (look for UDF; each . below can be replaced with any hex digit):

  • 0xe7f...f. - ARM or A1 encoding (ARMv4T, ARMv5T, ARMv6, ARMv7, ARMv8)
  • 0xde.. - Thumb or T1 encoding (ARMv4T, ARMv5T, ARMv6, ARMv7, ARMv8)
  • 0xf7f.a... - Thumb2 or T2 encoding (ARMv6T2, ARMv7, ARMv8)
  • 0x0000.... - ARMv8-A permanently undefined [2]

There are others, but these are probably as future-proof as it gets.

As for generating the code that triggers it, you can just use .short or .word, like this:

  • asm volatile (".short 0xde00\n"); /* thumb illegal instruction */
  • asm volatile (".word 0xe7f000f0\n"); /* arm illegal instruction */
  • asm volatile (".word 0xe7f0def0\n"); /* arm+thumb illegal instruction */

Note that the last one will decode to 0xdef0, 0xe7f0 in thumb context, and therefore also cause an undefined instruction exception.
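The patterns listed above translate directly into mask-and-compare checks, which may help when a test fixture inspects the faulting instruction. This is a sketch under the assumption that the wildcard positions are exactly the `.` digits shown; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0xe7f...f. : ARM (A1) encoding */
static bool is_udf_arm(uint32_t word)
{
    return (word & 0xfff000f0u) == 0xe7f000f0u;
}

/* 0xde.. : 16-bit Thumb (T1) encoding */
static bool is_udf_thumb16(uint16_t hw)
{
    return (hw & 0xff00u) == 0xde00u;
}

/* 0xf7f.a... : 32-bit Thumb (T2) encoding.
 * hw1 is the halfword at the lower address, hw2 the one after it. */
static bool is_udf_thumb32(uint16_t hw1, uint16_t hw2)
{
    return (hw1 & 0xfff0u) == 0xf7f0u && (hw2 & 0xf000u) == 0xa000u;
}
```

For the dual-purpose word `0xe7f0def0`, `is_udf_arm(0xe7f0def0)` holds, and so does `is_udf_thumb16(0xdef0)` for the halfword a Thumb-state core would fetch first.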

[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.subset.architecture.reference/index.html

[2] DDI0487D_b_armv8_arm.pdf

domen
  • Does any flavour of ARM assembly language have mnemonics for any of these, the way x86 has [`UD2`](http://www.felixcloutier.com/x86/UD2.html)? – Peter Cordes Nov 25 '16 at 03:36
  • ARM documents reference `UDF`, but I don't know if that's recognised by any assemblers. gcc/gas does not recognise it. – domen Nov 25 '16 at 09:35
  • And for ARMv8 / aarch64 / arm64, there are `BRK`, `HLT`, `HVC`, `SMC` and `SVC` documented in *C3.1.4 Exception generation and return* of DDI0487A (ARM ARM). Not quite like `permanently undefined`, but could be useful. – domen Jan 25 '17 at 14:54
  • @PeterCordes GNU `as` as of Binutils 2.29.51.20170811 does assemble `udf 0x1234` into `f4 23 f1 e7` (in unified syntax, at least). – Ruslan Aug 23 '17 at 20:31
  • @domen doesn't ARMv8 A64 also describe a UDF instruction at encoding 0000XXXX? I don't understand why GNU GAS 2.29 does not have a mnemonic for it, however. – Ciro Santilli May 20 '19 at 15:55
  • @CiroSantilli I've updated the answer. I can't find a reference with a quick search, but wouldn't ARMv8 A64 be backwards compatible with A1 encoding as well? – domen May 21 '19 at 07:17
  • Now I understood why it is not in GAS: the A64 one only appears on a very recent version of the ARMARM: https://static.docs.arm.com/ddi0487/db/DDI0487D_b_armv8_arm.pdf but not in https://static.docs.arm.com/ddi0487/ca/DDI0487C_a_armv8_arm.pdf I don't know in general, but in this case the A64 encoding for UDF != A32 ones. – Ciro Santilli May 21 '19 at 08:38
7

The thumb-16 version of the permanently undefined instruction is 0xDExx. So you can do this in your code to trigger the exception:

.inst 0xde00

Reference: ARMv7-M Architecture Reference Manual, section A5.2.6.

(Note that 0xF7Fx, 0xAxxx encoding is also permanently undefined, but is a 32-bit instruction.)

JP Sugarbroad