Where do addresses in S-Record files come from?

Question

I am developing a freestanding application for an ARM Cortex-M microcontroller. While researching the structure of an S-Record file, I found that I misunderstand how addresses are represented in the S-Record format.

I have a variable defined in my source code like so:

uint32_t g_ip_address = IP_ADDRESS(10, 1, 0, 56); // in LE: 0x3800010A

When I run objdump I see that the variable ends up in the .data section at address 0x1ffe01c4:

$ arm-none-eabi-objdump -t application.elf | grep g_ip_address
1ffe01c4 g     O .data  00000004 g_ip_address

This makes sense, given that the memory section of my linker script looks like this and .data is going to RAM:

MEMORY
{
  FLASH (rx)         : ORIGIN = 0x00000000, LENGTH = 0x0200000  /*   2M */
  RAM (rwx)          : ORIGIN = 0x1FFE0000, LENGTH = 0x00A0000  /* 640K */   
}

However, when I look through the srec file, the address of the record that holds the value is nowhere near the RAM range that starts at 0x1FFE0000. It is 0x0005F570, which seems to put it in the FLASH region (spaces added for clarity).

S315 0005F570 00000000 3800010A 000010180000000014

Is there an implicit offset encoded in a different record entry? How does objcopy arrive at this new address? Or is this value being encoded into a function in some way (some pre-main initialization of variables, perhaps)?

Ultimately, my goal is to be able to parse the srec file and patch the IP address value to create a new srec file. Is the idiomatic way of doing something like this simply to create a struct that hardcodes some leading magic number sequence that can be detected in the file?

the srecord format is documented at wikipedia. with s3 records the full address is in each line — old_timer, Sep 17 '21 at 00:02
for an mcu there should be no ram defined in the binary, that would be a pretty serious bug in your build. any non-zero ram initialization would be in the flash space and then copied to ram in the bootstrap. write a simple asm program, less than 10 lines will do it, then look at how it lands in the image. — old_timer, Sep 17 '21 at 00:05
also look at the sections part of the linker script the answers are there — old_timer, Sep 17 '21 at 00:05
you can also use readelf to look at the binary to find out about the results of the link — old_timer, Sep 17 '21 at 00:27
Oh of course, I got so far down my line of thought that I forgot to question my basic assumptions. Is there any way to know concretely where to look in the srec file for the value, assuming I have the elf file (short of disassembling the elf file and finding the instructions that set the relevant address in RAM)? — Shane Snover, Sep 17 '21 at 01:47

score 2 · Accepted Answer · answered Sep 17 '21 at 02:47

flash.s

.cpu cortex-m0
.thumb

.word 0x00002000
.word reset

.thumb_func
reset:
    b reset
    
.data
.word 0x11223344

.bss
.word 0x00000000
.word 0x00000000

flash.ld

MEMORY
{
    rom : ORIGIN = 0x08000000, LENGTH = 0x1000
    ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
    .text   : { *(.text*)   } > rom
    .bss    : { *(.bss*)    } > ram AT > rom
    .data   : { *(.data*)   } > ram AT > rom
}

build it

arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
arm-none-eabi-ld -nostdlib -nostartfiles -T flash.ld flash.o -o so.elf
arm-none-eabi-objdump -D so.elf > so.list
arm-none-eabi-objcopy --srec-forceS3 so.elf -O srec so.srec
arm-none-eabi-objcopy -O binary so.elf so.bin

cat so.list

08000000 <reset-0x8>:
 8000000:   00002000    andeq   r2, r0, r0
 8000004:   08000009    stmdaeq r0, {r0, r3}

08000008 <reset>:
 8000008:   e7fe        b.n 8000008 <reset>

Disassembly of section .bss:

20000000 <.bss>:
    ...

Disassembly of section .data:

20000008 <.data>:
20000008:   11223344            ; <UNDEFINED> instruction: 0x11223344

cat so.srec

S00A0000736F2E7372656338
S30F080000000020000009000008FEE7D2
S3090800000A443322113A
S70508000000F2
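
To make the layout of the records above explicit, here is a minimal Python sketch that splits one record into its type, address, and data, and verifies the checksum (the one's complement of the low byte of the sum of the count, address, and data bytes). The function name `parse_srec_line` is hypothetical and not part of binutils; the address widths per record type follow the published S-record format.

```python
def parse_srec_line(line):
    """Split one S-record into (type, address, data), verifying its checksum."""
    line = line.strip()
    if not line.startswith("S"):
        raise ValueError("not an S-record")
    rec_type = line[1]
    raw = bytes.fromhex(line[2:])
    count, body, checksum = raw[0], raw[1:-1], raw[-1]
    # The count covers everything after itself: address + data + checksum.
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    # Checksum: one's complement of the low byte of the sum of all prior bytes.
    if (~sum(raw[:-1])) & 0xFF != checksum:
        raise ValueError("bad checksum")
    # Address width in bytes depends on the record type.
    addr_len = {"0": 2, "1": 2, "2": 3, "3": 4,
                "5": 2, "6": 3, "7": 4, "8": 3, "9": 2}[rec_type]
    address = int.from_bytes(body[:addr_len], "big")
    return rec_type, address, body[addr_len:]


rec_type, address, data = parse_srec_line("S3090800000A443322113A")
print(rec_type, hex(address), data.hex())  # 3 0x800000a 44332211
```

The second S3 record therefore carries the four .data bytes at load address 0x0800000A, directly after the ten bytes of .text.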

arm-none-eabi-readelf -l so.elf

Elf file type is EXEC (Executable file)
Entry point 0x8000000
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000094 0x08000000 0x08000000 0x0000a 0x0000a R E 0x2
  LOAD           0x000000 0x20000000 0x0800000a 0x00000 0x00008 RW  0x1
  LOAD           0x00009e 0x20000008 0x0800000a 0x00004 0x00004 RW  0x1

 Section to Segment mapping:
  Segment Sections...
   00     .text 
   01     .bss 
   02     .data
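
The program headers above pair each segment's run-time address (VirtAddr) with the address its bytes are stored at in the image (PhysAddr). As a rough sketch, the mapping can be written as a lookup over those rows. The names `SEGMENTS` and `load_address` are hypothetical, the table is copied by hand from this readelf output, and the arithmetic only holds inside a single segment that actually has bytes in the file (FileSiz greater than zero).

```python
# (VirtAddr, PhysAddr, FileSiz) for each LOAD segment, copied from readelf -l.
SEGMENTS = [
    (0x08000000, 0x08000000, 0x0A),  # .text
    (0x20000000, 0x0800000A, 0x00),  # .bss: no bytes stored in the image
    (0x20000008, 0x0800000A, 0x04),  # .data
]


def load_address(vaddr, segments=SEGMENTS):
    """Map a run-time address to the address its initial value is stored at."""
    for seg_vaddr, seg_paddr, filesz in segments:
        if seg_vaddr <= vaddr < seg_vaddr + filesz:
            return seg_paddr + (vaddr - seg_vaddr)
    return None  # not backed by the image, e.g. anything in .bss


print(hex(load_address(0x20000008)))  # 0x800000a
print(load_address(0x20000000))       # None
```

The result for 0x20000008 matches the address of the second S3 record in so.srec and the offset 0x0a of the .data word in so.bin.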

hexdump -C so.bin

00000000  00 20 00 00 09 00 00 08  fe e7 44 33 22 11        |. ........D3".|
0000000e

.bss is not normally exposed as is: you complicate your linker script by adding beginning and end symbols so that you can zero that range in your bootstrap. For .data, you can clearly see what is going on with the standard binutils tools.

You have not provided enough of your code (and linker script), nor a minimal example that demonstrates the problem, so this is about as far as this can go.

Ok, thank you! Here's my understanding: readelf gives the physical and virtual address of the .data segment. By subtracting the virtual address from that of my symbol (`g_ip_address`) I get an offset which I can then apply to the physical address. The physical address is what is shown in the srec file and so I can use that to search for the proper record entry. — Shane Snover, Sep 17 '21 at 03:09
In my case, for .data the VirtAddr and PhysAddr are 0x1ffe01c0 and 0x0005f570 respectively. The VirtAddr of g_ip_address is 0x1ffe01c4, giving an offset of 4. This means the PhysAddr of g_ip_address is 0x0005f574. The correct entry then has the address I showed, 0x0005f570, and indeed there are 4 bytes preceding the byte pattern I found in that record. — Shane Snover, Sep 17 '21 at 03:13
I would not make any assumptions about subtracting one offset from another. from a system level you understand why the binary cannot have .data in a ram based loadable section yes? for an mcu? — old_timer, Sep 17 '21 at 12:16
if you want to know where the linker is putting something and you have access to the elf, then create a variable in the linker script and use that variable (which you can see with tools like nm). or you can put that variable in flash at a known offset (in .text) and then use the hardcoded offset and then you can go after the binary in whatever form if you want to change it post-compile. or control its offset/address in ram for a change at runtime — old_timer, Sep 17 '21 at 12:18
if you want to do bare metal, then IMO you need to understand the basics of the tools and the boot process. gnus compiler may be average, but binutils provides a number of tools to make life considerably easier to understand what is going on with the toolchain. — old_timer, Sep 17 '21 at 12:22
if you control where things land in .data then you can use the virtual or physical address plus the fixed offset you have forced into .data to find something. (just like you can force it in .text) — old_timer, Sep 17 '21 at 12:24
title question, where do the addresses come from, they come from the loadable sections in the binary — old_timer, Sep 17 '21 at 15:39
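
For the patching goal stated in the question, once the load address of the value is known, one possible approach is to overwrite the bytes in whichever S3 records cover that address and recompute each record's checksum. The following is a minimal sketch under stated assumptions: `patch_srec` is a hypothetical helper, it only touches S3 records (as produced by `--srec-forceS3`), and it assumes every input record is well formed.

```python
def patch_srec(lines, target, new_bytes):
    """Return a copy of `lines` with `new_bytes` written at load address `target`."""
    out = []
    for line in lines:
        line = line.strip()
        if not line.startswith("S3"):
            out.append(line)  # header and termination records pass through
            continue
        raw = bytearray(bytes.fromhex(line[2:]))
        address = int.from_bytes(raw[1:5], "big")
        data_len = len(raw) - 6  # minus count, 4 address bytes, checksum
        for i, value in enumerate(new_bytes):
            offset = target + i - address
            if 0 <= offset < data_len:
                raw[5 + offset] = value  # data starts after count + address
        # Recompute the checksum over count, address, and data.
        raw[-1] = ~sum(raw[:-1]) & 0xFF
        out.append("S3" + raw.hex().upper())
    return out


records = [
    "S00A0000736F2E7372656338",
    "S30F080000000020000009000008FEE7D2",
    "S3090800000A443322113A",
    "S70508000000F2",
]
patched = patch_srec(records, 0x0800000A, bytes.fromhex("DEADBEEF"))
print(patched[2])  # S3090800000ADEADBEEFAC
```

Here the so.srec records from the answer are used as input and the .data word at 0x0800000A is replaced; the other records come out unchanged. Because the loop checks every record, a value that straddles two records is patched in both.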
