
I am writing some firmware code for an ARM Cortex-M0 microcontroller (specifically, the STM32F072B as part of the STM32 Discovery dev board).

My linker script does not do anything special. It just fills out the vector table and then includes all the text and data sections from my code:

OUTPUT_FORMAT("elf32-littlearm")

MEMORY {
    ROM (rx) : ORIGIN = 0x00000000, LENGTH = 16K
    FLASH (r) : ORIGIN = 0x08000000, LENGTH = 64K
    RAM (rw)  : ORIGIN = 0x20000000, LENGTH = 16K
}

ENTRY(_start)

PROVIDE(__stack_top = ORIGIN(RAM) + LENGTH(RAM));


SECTIONS {

    .vector_table : {
        LONG(__stack_top);                  /* 00 */
        LONG(_start);                       /* 04 */  
        LONG(dummy_isr);                    /* 08 */
        LONG(dummy_isr);                    /* 0C */
        LONG(dummy_isr);                    /* 10 */
        LONG(dummy_isr);                    /* 14 */
        LONG(dummy_isr);                    /* 18 */
        LONG(dummy_isr);                    /* 1C */
        LONG(dummy_isr);                    /* 20 */
        LONG(dummy_isr);                    /* 24 */
        LONG(dummy_isr);                    /* 28 */
        LONG(dummy_isr);                    /* 2C */
        LONG(dummy_isr);                    /* 30 */
        LONG(dummy_isr);                    /* 34 */
        LONG(dummy_isr);                    /* 38 */
        LONG(dummy_isr);                    /* 3C */
        LONG(dummy_isr);                    /* 40 */
        LONG(dummy_isr);                    /* 44 */
        LONG(dummy_isr);                    /* 48 */
        LONG(dummy_isr);                    /* 4C */
        LONG(dummy_isr);                    /* 50 */
        LONG(dummy_isr);                    /* 54 */
        LONG(dummy_isr);                    /* 58 */
        LONG(dummy_isr);                    /* 5C */
        LONG(dummy_isr);                    /* 60 */
        LONG(dummy_isr);                    /* 64 */
        LONG(dummy_isr);                    /* 68 */
        LONG(dummy_isr);                    /* 6C */
        LONG(dummy_isr);                    /* 70 */
        LONG(dummy_isr);                    /* 74 */
        LONG(dummy_isr);                    /* 78 */
        LONG(dummy_isr);                    /* 7C */
        LONG(dummy_isr);                    /* 80 */
        LONG(dummy_isr);                    /* 84 */
        LONG(dummy_isr);                    /* 88 */
        LONG(dummy_isr);                    /* 8C */
        LONG(dummy_isr);                    /* 90 */
        LONG(dummy_isr);                    /* 94 */
        LONG(dummy_isr);                    /* 98 */
        LONG(dummy_isr);                    /* 9C */
        LONG(dummy_isr);                    /* A0 */
        LONG(dummy_isr);                    /* A4 */
        LONG(dummy_isr);                    /* A8 */
        LONG(dummy_isr);                    /* AC */
        LONG(dummy_isr);                    /* B0 */
        LONG(dummy_isr);                    /* B4 */
        LONG(dummy_isr);                    /* B8 */
        LONG(dummy_isr);                    /* BC */
    } > ROM AT > FLASH
    
    .text : {
        *(.text*)
    } > ROM AT > FLASH

    .rodata : {
        *(.rodata*)
        *(.data.rel.ro)
    } > FLASH

    .bss (NOLOAD) : {
        *(.bss*)
        *(COMMON)
    } > RAM

    .data : {
        *(.data*)
    } > RAM

    .ARM.exidx : {
       *(.ARM.exidx)
    } > FLASH

}

When I build and link an ELF file and dump the symbols, I notice that the addresses that end up in the .vector_table section, as well as the ELF entry point, are all off by one:

[shell]$ llvm-objdump --syms zig-cache/bin/main-flash 

zig-cache/bin/main-flash:       file format elf32-littlearm

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 main-flash
0000013c l       .text  00000000 $d.1
000000c0 l       .text  00000000 $t.0
000000c4 g     F .text  00000088 _start
000000c0 g     F .text  00000002 dummy_isr
20004000 g       *ABS*  00000000 __stack_top

[shell]$ llvm-objdump --full-contents --section=.vector_table zig-cache/bin/main-flash 

zig-cache/bin/main-flash:       file format elf32-littlearm

Contents of section .vector_table:
 0000 00400020 c5000000 c1000000 c1000000  .@. ............
 0010 c1000000 c1000000 c1000000 c1000000  ................
 0020 c1000000 c1000000 c1000000 c1000000  ................
 0030 c1000000 c1000000 c1000000 c1000000  ................
 0040 c1000000 c1000000 c1000000 c1000000  ................
 0050 c1000000 c1000000 c1000000 c1000000  ................
 0060 c1000000 c1000000 c1000000 c1000000  ................
 0070 c1000000 c1000000 c1000000 c1000000  ................
 0080 c1000000 c1000000 c1000000 c1000000  ................
 0090 c1000000 c1000000 c1000000 c1000000  ................
 00a0 c1000000 c1000000 c1000000 c1000000  ................
 00b0 c1000000 c1000000 c1000000 c1000000  ................
[shell]$ readelf -h zig-cache/bin/main-flash 
ELF Header:
...
  Entry point address:               0xc5

The symbol table shows _start at 0xC4, while the ELF entry point, which the linker script defines as _start, is set to 0xC5. Similarly, the address of dummy_isr written into the vector table is off by one: the dummy_isr symbol is defined as 0xC0, but the linker writes 0xC1 into the vector table. The disassembly of .text confirms that dummy_isr and _start begin at 0xC0 and 0xC4, respectively, so the addresses that the linker is writing appear to be wrong:

[shell]$ llvm-objdump --disassemble --section=.text zig-cache/bin/main-flash 
                                                               
zig-cache/bin/main-flash:       file format elf32-littlearm
                                                               

Disassembly of section .text:
                                                               
000000c0 <dummy_isr>:                                          
      c0: fe e7         b       #-4 <dummy_isr>   
      c2: c0 46         mov     r8, r8            
                               
000000c4 <_start>:
      c4: 82 b0         sub     sp, #8
      c6: 01 23         movs    r3, #1
      c8: d8 04         lsls    r0, r3, #19
      ca: 1c 49         ldr     r1, [pc, #112]
...

0xC1 and 0xC5 are not even the addresses of valid instructions; each one points into the middle of an instruction. What could cause this discrepancy?

  • Look at the ARM documentation: the vectors need to be the handler address ORed with one (LSB set, indicating this is a Thumb function). – old_timer Mar 21 '21 at 01:18
  • If the LSB is not set, you will get a fault. – old_timer Mar 21 '21 at 01:18
  • Does this answer your question? [Why this function does point to itself with a offset of 1?](https://stackoverflow.com/questions/65884094/why-this-function-does-point-to-itself-with-a-offset-of-1) – Tagli Mar 21 '21 at 05:10
  • "All other entries must have bit[0] set to 1, because this bit defines the EPSR.T bit on exception entry." – old_timer Mar 21 '21 at 15:16
  • "On exception entry, if bit [0] of the associated vector table entry is 0, execution of the first instruction causes a HardFault." – old_timer Mar 21 '21 at 15:16
  • You don't want to use that ROM AT > FLASH, as the 0x00000000-based mirror does not fully map the flash on these parts. You want to build expressly for 0x08000000 and end up with addresses such as 0x080000C1. – old_timer Mar 21 '21 at 15:19
  • You will also find that some tools for these parts will not load 0x00000000-based binaries; they want to see 0x08000000- or 0x20000000-based binaries (they examine the vector table and refuse to load the binary into the target MCU). – old_timer Mar 21 '21 at 15:21

1 Answer


This is called an "interworking address".

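The least significant bit (LSB) of the address indicates whether the target instruction is ARM (0) or Thumb (1). It is not part of the instruction address itself: the address actually fetched always has the LSB cleared, so 0xC5 means "Thumb code at 0xC4" and 0xC1 means "Thumb code at 0xC0".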

Since this platform only works in Thumb mode, all handler addresses in the vector table and all addresses used with the BX and BLX instructions must be odd (the X means (ex)change instruction set).
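
To make the relationship between the dumped values and the symbol addresses concrete, here is a minimal host-side sketch in C. The helper names (`vector_is_thumb`, `vector_target`, `le32`) are hypothetical and not part of any toolchain or vendor library; they only illustrate the bit-0 convention described above, using the byte order shown in the objdump output.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Bit 0 of a vector table entry selects the instruction set state:
 * 1 means Thumb, 0 means ARM (which a Cortex-M0 cannot execute). */
static int vector_is_thumb(uint32_t entry)
{
    return (int)(entry & 1u);
}

/* The handler's first instruction is at the entry with bit 0 cleared. */
static uint32_t vector_target(uint32_t entry)
{
    return entry & ~UINT32_C(1);
}

/* Assemble a 32-bit word from four little-endian bytes, in the order
 * they appear in the section dump (e.g. "c5000000"). */
static uint32_t le32(const uint8_t b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Applied to the values in the question, `vector_target(0xC5)` yields 0xC4 (the `_start` symbol) and `vector_target(0xC1)` yields 0xC0 (the `dummy_isr` symbol), with `vector_is_thumb` returning 1 for both entries.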

Tom V
  • This is not interworking, this is exception handling. The reason is similar, but BX and BLX are not used in this context. – old_timer Mar 21 '21 at 15:15
  • I did not say that BX or BLX is used, I said that the purpose of the LSB is the same. Any address on ARM where the LSB indicates the processor mode for the target instruction is named an "interworking address". – Tom V Mar 21 '21 at 19:06