Is there an instruction decoder for the ARMv7-M instruction set that takes an opcode as a parameter and returns the corresponding instruction type?

For example, MOV Rd, <op2> has four different versions, depending on <op2>:

  1. A simple register specifier, for example Rm.
  2. An immediate shifted register, for example Rm, LSL #4.
  3. A register shifted register, for example Rm, LSL Rs.
  4. An immediate value, for example #0xE000E000.

I want to determine which of these versions an instruction is, given only its opcode.


Editor's note: here "opcode" means the actual hex machine encoding of the instruction, while the question is about the assembler mnemonic. On many CPUs, a single mnemonic may map to different opcodes (machine instructions) depending on its arguments.

artless noise
Kyriakos

1 Answer

Yes, it's called a disassembler. Put the opcode in an assembly file, assemble it, and then disassemble the resulting object file.

$ cat in.s
    .syntax unified
    .align  2
    .code   16
    .globl  _foo
    .thumb_func _foo
_foo:
    .short  0x4615
    .long   0x43e0e92d

$ clang -arch armv7m -c in.s
$ otool -arch armv7m -tv in.o
in.o:
(__TEXT,__text) section
_foo:
00000000        4615    mov r5, r2
00000002    e92d43e0    push.w  {r5, r6, r7, r8, r9, lr}
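
The listing uses `.short` for the first opcode and `.long` for the second because Thumb instructions are either 16 or 32 bits wide, and the width is determined by the top five bits of the first halfword. A minimal Python sketch of that check follows; the helper names `thumb_width` and `split_long` are made up for illustration and are not part of any toolchain, and a little-endian target is assumed.

```python
def thumb_width(first_halfword):
    """Return 16 or 32: the width in bits of the Thumb instruction
    that starts with this halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    # In Thumb-2, these three prefixes mark the first half of a 32-bit encoding.
    return 32 if top5 in (0b11101, 0b11110, 0b11111) else 16


def split_long(word):
    """Split a little-endian .long value into (first, second) halfwords
    in the order they appear in memory (assumption: little-endian target)."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


if __name__ == "__main__":
    print(thumb_width(0x4615))            # the `.short` from in.s
    hw1, hw2 = split_long(0x43E0E92D)     # the `.long` from in.s
    print(hex(hw1), hex(hw2), thumb_width(hw1))
```

This is also why the `.long` value looks swapped relative to the `e92d43e0` shown by the disassembler: the halfword that comes first in memory sits in the low 16 bits of the word.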
Variable Length Coder
  • Thank you for your answer, but that's not what I am looking for. For example, MOV Rd, <op2> has four different versions depending on <op2>: a) a simple register specifier, for example Rm; b) an immediate shifted register, for example Rm, LSL #4; c) a register shifted register, for example Rm, LSL Rs; d) an immediate value, for example #0xE000E000. I want to know which of these versions it is without needing to disassemble the file and parse the output. If there is no such tool, I guess I will need to parse it and figure it out myself. – Kyriakos Dec 16 '14 at 16:52
  • @kkk This advice still applies. Instead of using `.short` and `.long`, write your actual code and use `objdump` to see the opcode chosen by the assembler. Another method is to generate a listing file. For your `mov` example, the op-code chosen will be apparent. An assembler will not choose an `lsl #0` option (if it exists) for instance. Forensics may use this choice to determine what compiler/assembler was used for something. Some instructions are ambiguous or have multiple encodings. A *mnemonic* may even be a *pseudo-instruction* and map to multiple op-codes. – artless noise Dec 17 '14 at 17:02
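
If parsing disassembler output is not wanted, the four forms from the question can in principle be told apart with mask-and-compare tests on the raw halfwords. The following Python sketch is only an illustration: the function names are hypothetical, and the mask/value pairs reflect my reading of the MOV and shift encodings in the ARMv7-M Architecture Reference Manual, so they should be checked against the manual before being relied on. It also assumes that, in Thumb, the shifted forms of MOV are encoded as the corresponding shift instructions (LSL, LSR, ASR, ROR).

```python
def is_32bit(hw1):
    """True if hw1 is the first halfword of a 32-bit Thumb encoding."""
    return ((hw1 >> 11) & 0x1F) in (0b11101, 0b11110, 0b11111)


def classify_mov(hw1, hw2=None):
    """Classify a MOV-family Thumb encoding by operand form.

    Returns one of "register", "immediate", "immediate-shifted register",
    "register-shifted register", or None if the encoding is not covered
    by this sketch.
    """
    if not is_32bit(hw1):
        # MOV Rd, Rm (any registers) or MOVS Rd, Rm (low registers)
        if (hw1 & 0xFF00) == 0x4600 or (hw1 & 0xFFC0) == 0x0000:
            return "register"
        # MOVS Rd, #imm8
        if (hw1 & 0xF800) == 0x2000:
            return "immediate"
        # LSLS/LSRS/ASRS Rd, Rm, #imm5
        if (hw1 >> 11) in (0b00000, 0b00001, 0b00010):
            return "immediate-shifted register"
        # 16-bit data-processing group: LSL, LSR, ASR, ROR by register
        if (hw1 & 0xFC00) == 0x4000 and ((hw1 >> 6) & 0xF) in (2, 3, 4, 7):
            return "register-shifted register"
        return None

    if hw2 is None:
        raise ValueError("a 32-bit encoding needs its second halfword")
    # MOV{S}.W Rd, Rm, or a shift by immediate when any of the
    # imm3/imm2/type fields in hw2 is non-zero
    if (hw1 & 0xFFEF) == 0xEA4F:
        return "register" if (hw2 & 0x70F0) == 0 else "immediate-shifted register"
    # MOV{S}.W Rd, #const (modified immediate)
    if (hw1 & 0xFBEF) == 0xF04F and (hw2 & 0x8000) == 0:
        return "immediate"
    # MOVW Rd, #imm16
    if (hw1 & 0xFBF0) == 0xF240 and (hw2 & 0x8000) == 0:
        return "immediate"
    # LSL/LSR/ASR/ROR{S}.W Rd, Rn, Rm
    if (hw1 & 0xFF80) == 0xFA00 and (hw2 & 0xF0F0) == 0xF000:
        return "register-shifted register"
    return None


if __name__ == "__main__":
    print(classify_mov(0x4615))          # the 16-bit opcode from the answer
    print(classify_mov(0xE92D, 0x43E0))  # push.w from the answer, not a MOV
```

A table-driven decoder like this only recognises what has been entered into it, so a real disassembler remains the safer choice for anything beyond a handful of instructions.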