
I have a binary firmware image for ARM Cortex M that I know should be loaded at 0x20000000. I would like to convert it to a format that I can use for assembly level debugging with gdb, which I assume means converting to an .elf. But I have not been able to figure out how to add enough metadata to the .elf for this to happen. Here is what I've tried so far.

arm-none-eabi-objcopy -I binary -O elf32-littlearm --set-section-flags \
    .data=alloc,contents,load,readonly \
    --change-section-address .data=0x20000000 efr32.bin efr32.elf

efr32.elf:     file format elf32-little
efr32.elf
architecture: UNKNOWN!, flags 0x00000010:
HAS_SYMS
start address 0x00000000

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00000168  20000000  20000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
20000000 l    d  .data  00000000 .data
20000000 g       .data  00000000 _binary_efr32_bin_start
20000168 g       .data  00000000 _binary_efr32_bin_end
00000168 g       *ABS*  00000000 _binary_efr32_bin_size

Do I need to start by converting the binary to a .o and writing a simple linker script? Should I add an architecture option to the objcopy command?

joeforker
  • There are objcopy ways to do this, but you need a fixed-length instruction set: thumb without thumb2 (although that probably won't work with gnu), arm without thumb, 32-bit mips only (without the 16-bit instructions). Not x86, and not a number of others. – old_timer Sep 01 '17 at 03:59

1 Answer


A little experiment...

  58:   480a        ldr r0, [pc, #40]   ; (84 <spi_write_byte+0x38>)
  5a:   bf08        it  eq
  5c:   4809        ldreq   r0, [pc, #36]   ; (84 <spi_write_byte+0x38>)
  5e:   f04f 01ff   mov.w   r1, #255    ; 0xff

You don't have that listing, of course, but you can read the binary and turn it into this:

.thumb
.globl _start
_start:
.inst.n 0x480a
.inst.n 0xbf08
.inst.n 0x4809
.inst.n 0xf04f
.inst.n 0x01ff

then see what happens.

arm-none-eabi-as test.s -o test.o
arm-none-eabi-ld -Ttext=0x58 test.o -o test.elf
arm-none-eabi-objdump -D test.elf

test.elf:     file format elf32-littlearm


Disassembly of section .text:

00000058 <_start>:
  58:   480a        ldr r0, [pc, #40]   ; (84 <_start+0x2c>)
  5a:   bf08        it  eq
  5c:   4809        ldreq   r0, [pc, #36]   ; (84 <_start+0x2c>)
  5e:   f04f 01ff   mov.w   r1, #255    ; 0xff
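
The hand-written listing above could be generated by a small script rather than typed in. What follows is only a minimal sketch: it assumes the image is little-endian and consists entirely of Thumb code, and the function name `halfwords_to_asm` is made up for this illustration.

```python
import struct
import sys


def halfwords_to_asm(image: bytes) -> str:
    """Return GNU as source that emits `image` as a run of 16-bit .inst.n values.

    Assumption: the image is little-endian and every halfword is treated as
    code, which is exactly the linear approach shown above.
    """
    if len(image) % 2:
        raise ValueError("image length must be a multiple of 2 bytes")
    lines = [".thumb", ".globl _start", "_start:"]
    for (hw,) in struct.iter_unpack("<H", image):
        lines.append(f".inst.n 0x{hw:04x}")
    return "\n".join(lines) + "\n"


# Hypothetical command-line use: python bin2inst.py efr32.bin > test.s
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as f:
        sys.stdout.write(halfwords_to_asm(f.read()))
```

The output can then be assembled and linked with the same `arm-none-eabi-as` and `arm-none-eabi-ld` commands, with `-Ttext` set to the real load address.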

But in reality this won't work in general. If the binary contains any thumb2 extensions, it isn't going to work: you can't disassemble variable-length instructions linearly, you have to deal with them in execution order. So to do this correctly you have to write a disassembler that walks through the code in execution order, works out which halfwords are instructions, and marks them as instructions. Take this fragment, where two data words sit between the instructions:

  80:   d1e8        bne.n   54 <spi_write_byte+0x8>
  82:   bd70        pop {r4, r5, r6, pc}
  84:   40005200
  88:   F7FF4000
  8c:   e92d 41f0   stmdb   sp!, {r4, r5, r6, r7, r8, lr}
  90:   4887        ldr r0, [pc, #540]  ; (2b0 <notmain+0x224>)

Converted halfword by halfword, that becomes:

.thumb
.globl _start
_start:
.inst.n 0xd1e8
.inst.n 0xbd70
.inst.n 0x5200
.inst.n 0x4000
.inst.n 0x4000
.inst.n 0xF7FF
.inst.n 0xe92d
.inst.n 0x41f0
.inst.n 0x4887

  80:   d1e8        bne.n   54 <_start-0x2c>
  82:   bd70        pop {r4, r5, r6, pc}
  84:   5200        strh    r0, [r0, r0]
  86:   4000        ands    r0, r0
  88:   4000        ands    r0, r0
  8a:   f7ff e92d           ; <UNDEFINED> instruction: 0xf7ffe92d
  8e:   41f0        rors    r0, r6
  90:   4887        ldr r0, [pc, #540]  ; (2b0 <_start+0x230>)

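The data halfword 0xf7ff was taken as the first half of a 32-bit instruction and swallowed 0xe92d, the real start of the stmdb. The disassembly will recover, then break and recover again, and so on.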
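
To see why the linear pass loses sync, here is a small sketch of how a Thumb decoder picks the instruction length. It relies on one encoding rule: a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction. The function names are made up for this illustration.

```python
def is_32bit_prefix(hw: int) -> bool:
    """True if `hw` is the first halfword of a 32-bit Thumb encoding."""
    return (hw >> 11) in (0b11101, 0b11110, 0b11111)


def linear_sweep(halfwords, base):
    """Group halfwords the way a linear disassembler would.

    Returns a list of (address, [halfwords]) tuples, one per "instruction".
    """
    out = []
    i = 0
    while i < len(halfwords):
        wide = is_32bit_prefix(halfwords[i]) and i + 1 < len(halfwords)
        n = 2 if wide else 1
        out.append((base + 2 * i, halfwords[i:i + n]))
        i += n
    return out


# The halfword stream from the example above, starting at address 0x80.
stream = [0xd1e8, 0xbd70, 0x5200, 0x4000, 0x4000, 0xf7ff, 0xe92d, 0x41f0, 0x4887]
for addr, group in linear_sweep(stream, 0x80):
    print(f"{addr:4x}:   " + " ".join(f"{h:04x}" for h in group))
```

The sweep pairs 0xf7ff with 0xe92d at address 0x8a, which is the same grouping objdump produced.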

Instead you have to write a disassembler that walks through the code. It doesn't necessarily have to disassemble to assembly language, but it needs to decode enough to walk the code and recurse down all possible branch paths. Everything not determined to be an instruction gets marked as data:

.thumb
.globl _start
_start:
.inst.n 0xd1e8
.inst.n 0xbd70
.word 0x40005200
.word 0xF7FF4000
.inst.n 0xe92d
.inst.n 0x41f0
.inst.n 0x4887

00000080 <_start>:
  80:   d1e8        bne.n   54 <_start-0x2c>
  82:   bd70        pop {r4, r5, r6, pc}
  84:   40005200    andmi   r5, r0, r0, lsl #4
  88:   f7ff4000            ; <UNDEFINED> instruction: 0xf7ff4000
  8c:   e92d 41f0   stmdb   sp!, {r4, r5, r6, r7, r8, lr}
  90:   4887        ldr r0, [pc, #540]  ; (2b0 <_start+0x230>)

And our stmdb instruction is now disassembled correctly.
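
A walker like that could be sketched as below. This is a deliberately incomplete illustration, not a finished tool. It follows only 16-bit `B`, 16-bit `B<cond>` and 32-bit `BL`, stops at `POP {..., pc}` and `BX`, and lets everything else fall through, so other control flow (32-bit `B.W`, `CBZ`, table branches, writes to pc) is not handled. The entry points must be supplied by hand, for example from the vector table. All function names are made up for this example.

```python
import struct


def is_32bit_prefix(hw):
    return (hw >> 11) in (0b11101, 0b11110, 0b11111)


def sign_extend(value, bits):
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def successors(addr, hw, hw2):
    """Return (size in bytes, addresses that may execute next)."""
    pc = addr + 4  # Thumb reads PC as the instruction address + 4
    if is_32bit_prefix(hw):
        is_bl = (hw & 0xF800) == 0xF000 and hw2 is not None and (hw2 & 0xD000) == 0xD000
        if is_bl:  # BL: follow the call target and the return path
            s, j1, j2 = (hw >> 10) & 1, (hw2 >> 13) & 1, (hw2 >> 11) & 1
            i1, i2 = 1 - (j1 ^ s), 1 - (j2 ^ s)
            imm = (s << 24) | (i1 << 23) | (i2 << 22) | ((hw & 0x3FF) << 12) | ((hw2 & 0x7FF) << 1)
            return 4, [addr + 4, pc + sign_extend(imm, 25)]
        return 4, [addr + 4]  # simplification: any other 32-bit encoding falls through
    if (hw & 0xF000) == 0xD000 and ((hw >> 8) & 0xF) < 0xE:  # B<cond>
        return 2, [addr + 2, pc + 2 * sign_extend(hw & 0xFF, 8)]
    if (hw & 0xF800) == 0xE000:  # B
        return 2, [pc + 2 * sign_extend(hw & 0x7FF, 11)]
    if (hw & 0xFF00) == 0xBD00 or (hw & 0xFF87) == 0x4700:  # POP {..., pc} / BX
        return 2, []
    return 2, [addr + 2]


def walk(image, base, entries):
    """Return the set of byte addresses reached as code from `entries`."""
    end = base + len(image)
    code, todo = set(), [e & ~1 for e in entries]  # clear the Thumb bit
    while todo:
        addr = todo.pop()
        if addr in code or addr < base or addr + 2 > end:
            continue
        hw = struct.unpack_from("<H", image, addr - base)[0]
        hw2 = struct.unpack_from("<H", image, addr - base + 2)[0] if addr + 4 <= end else None
        size, nexts = successors(addr, hw, hw2)
        code.update(range(addr, min(addr + size, end)))
        todo.extend(nexts)
    return code


def emit(image, base, code):
    """Emit .inst.n for code halfwords and .word/.hword for everything else."""
    lines, off = [".thumb", ".globl _start", "_start:"], 0
    while off + 2 <= len(image):
        addr = base + off
        if addr in code:
            lines.append(".inst.n 0x%04x" % struct.unpack_from("<H", image, off)[0])
            off += 2
        elif addr % 4 == 0 and off + 4 <= len(image) and addr + 2 not in code:
            lines.append(".word 0x%08x" % struct.unpack_from("<I", image, off)[0])
            off += 4
        else:
            lines.append(".hword 0x%04x" % struct.unpack_from("<H", image, off)[0])
            off += 2
    return "\n".join(lines) + "\n"


# The fragment from above at 0x80; 0x8c is assumed known as a second entry point.
image = struct.pack("<9H", 0xd1e8, 0xbd70, 0x5200, 0x4000, 0x4000, 0xf7ff, 0xe92d, 0x41f0, 0x4887)
print(emit(image, 0x80, walk(image, 0x80, [0x80, 0x8c])))
```

For this fragment the sketch produces the same mix of `.inst.n` and `.word` lines as the hand-written version. Branch targets outside the image, such as 0x54 here, are simply ignored.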

good luck.

old_timer
  • Which objdump options did you use to show the .inst.n lines? – joeforker Sep 01 '17 at 15:05
  • Those were created by hand in this case. You would write a tool that reads the binary and creates a file like that. – old_timer Sep 01 '17 at 15:15
  • Note that even though I declared everything as 16-bit thumb instructions, writing each thumb2 extension as two 16-bit .inst.n instead of one 32-bit .inst.w, the disassembler still figured out it was thumb2... – old_timer Sep 01 '17 at 15:16
  • I don't know what core/instruction set you are using, but if it is a cortex-m then armv6-m and armv7-m (and armv8-m) have thumb2 extensions that may have to be dealt with. Was this binary built using any of them, or was it pure thumb? I also left out the vector table; that is pretty simple and can be figured out visually. Put .words in there for that portion of the binary. – old_timer Sep 01 '17 at 15:18
  • It has some 32-bit branch instructions – joeforker Sep 05 '17 at 19:35
  • That doesn't answer the question. Which specific chip are you using? – old_timer Sep 05 '17 at 19:45
  • arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb – joeforker Sep 05 '17 at 20:04
  • That is not the chip type, that is the core that the chip vendor bought, and/or you are using the least common denominator in your build system to make code for any of the cortex-m cores... – old_timer Sep 05 '17 at 20:06
  • It doesn't matter at this point, the answer still stands. If you have access to gcc and the sources, then why are you trying to reverse the binary back to an elf? Just save the elf when you build it... – old_timer Sep 05 '17 at 20:08
  • It is the flash stub for the black magic probe debugger, which is expected to generate a short program that can be loaded into RAM without a linker script and then perform the flash programming operation. If upstream will not accept a linker script, an alternative might be to generate the elf at 0x0, use objcopy to relocate the section without fixing offsets, and debug to find the offending instruction. Also, gdb probably has a way to load and debug raw binaries? – joeforker Sep 06 '17 at 19:14
  • `.word 0x40005200` and `.word 0xF7FF4000` look incorrect. I think that puts the bytes in the data section on big-endian ARM, and not the `.code` section. Does `.inst.w` work? Also see [arm thumb2 ldr.w syntax?](https://sourceware.org/ml/binutils/2011-09/msg00086.html) on the binutils mailing list. – jww May 21 '19 at 00:40