I'm aware this is old, but I came across this problem as well, and figured I'd provide my findings.
TL;DR: It's possible, but incredibly difficult. You can't simply move symbols into their own sections. Relocations will bite you.
When the compiler generates machine code, it emits slightly different instructions depending on whether the -ffunction-sections and -fdata-sections flags are provided. This is because the compiler makes assumptions about where symbols will be located, and those assumptions change with the flags.
This is best illustrated by example. Take the following very simple code snippet:
int a, b;
int getAPlusB()
{
    return a + b;
}
The following is the output of arm-none-eabi-objdump -xdr test.o for each build. Without the flags:
arm-none-eabi-gcc -c -Os -mthumb -mcpu=cortex-m3 -mlittle-endian -o test.o test.c
SYMBOL TABLE:
00000000 g F .text 0000000c getAPlusB
00000004 g O .bss 00000004 b
00000000 g O .bss 00000004 a
Disassembly of section .text:
00000024 <getAPlusB>:
24: 4b01 ldr r3, [pc, #4] ; (2c <getAPlusB+0x8>)
26: cb09 ldmia r3, {r0, r3}
28: 4418 add r0, r3
2a: 4770 bx lr
2c: 00000000 .word 0x00000000
2c: R_ARM_ABS32 .bss
With the flags:
arm-none-eabi-gcc -c -Os -ffunction-sections -fdata-sections \
-mthumb -mcpu=cortex-m3 -mlittle-endian -o test.o test.c
SYMBOL TABLE:
00000000 g F .text.getAPlusB 00000014 getAPlusB
00000000 g O .bss.b 00000004 b
00000000 g O .bss.a 00000004 a
Disassembly of section .text.getAPlusB:
00000000 <getAPlusB>:
0: 4b02 ldr r3, [pc, #8] ; (c <getAPlusB+0xc>)
2: 6818 ldr r0, [r3, #0]
4: 4b02 ldr r3, [pc, #8] ; (10 <getAPlusB+0x10>)
6: 681b ldr r3, [r3, #0]
8: 4418 add r0, r3
a: 4770 bx lr
...
c: R_ARM_ABS32 .bss.a
10: R_ARM_ABS32 .bss.b
The difference is subtle, but important. The code compiled with the flags performs two separate loads, while the code compiled without them performs a single "load multiple" (ldmia). The compiler can emit the load multiple in the build without the flags because it knows both symbols are in the same section, in a fixed order. With the flags, this is not the case: the symbols are in two separate sections, and while they will likely keep their order and proximity, this is not guaranteed. What's more, if only one of the two sections ends up being referenced, the linker may decide the other is unused and remove it (for example under --gc-sections).
Another example:
int a, b;
int getB()
{
    return b;
}
And the generated code. First without the flags:
SYMBOL TABLE:
00000000 g F .text 0000000c getB
00000004 g O .bss 00000004 b
00000000 g O .bss 00000004 a
Disassembly of section .text:
00000018 <getB>:
18: 4b01 ldr r3, [pc, #4] ; (20 <getB+0x8>)
1a: 6858 ldr r0, [r3, #4]
1c: 4770 bx lr
1e: bf00 nop
20: 00000000 .word 0x00000000
20: R_ARM_ABS32 .bss
And with the flags:
SYMBOL TABLE:
00000000 g F .text.getB 00000014 getB
00000000 g O .bss.b 00000004 b
00000000 g O .bss.a 00000004 a
Disassembly of section .text.getB:
00000000 <getB>:
0: 4b01 ldr r3, [pc, #4] ; (8 <getB+0x8>)
2: 6818 ldr r0, [r3, #0]
4: 4770 bx lr
6: bf00 nop
8: 00000000 .word 0x00000000
8: R_ARM_ABS32 .bss.b
In this case, the difference is even more subtle. The code compiled with the flags loads with an offset of 0, while the code compiled without them uses an offset of 4. Without the flags, the relocation references the beginning of the .bss section, so the load has to add the offset of b within that section. With the flags, the relocation references .bss.b, which contains only b, so no offset is needed. If we were to split the sections after the fact and change only the relocation, the new code would contain a reference to the section a is in, but not to the one b is in. This, again, could cause the linker to garbage collect the wrong section.
These were just the two scenarios I came across when looking at this problem; there may be more.
Producing valid object files functionally equivalent to code compiled with the -ffunction-sections and -fdata-sections flags would require parsing the machine instructions to find these and any other relocation issues that could come up. This is not an easy task.