I have an issue with an ELF file generated by the GNU linker ld.

The result is that the data section (.data) gets corrupted when the executable is loaded into memory. The corruption to the .data section occurs when the loader performs the relocation on the .eh_frame section using the relocation data (.rela.eh_frame).

What happens is that this relocation causes seven writes beyond the end of the .eh_frame section, which overwrite the correct contents of the .data section that sits directly above .eh_frame.

After some investigation, I believe the loader is behaving correctly, but the ELF file it has been given contains an error.

But I could be wrong and wanted to check what I've found so far.

Using readelf on the ELF file, it can be seen that seven of the entries in the .rela.eh_frame section contain offsets that are outside (above) the range given by readelf for the .eh_frame section, i.e. the seven offsets in .rela.eh_frame are greater than the length given for .eh_frame. When these seven offsets are applied in the relocation, they corrupt the .data section.
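The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the raw bytes of the .rela.eh_frame section and the size of .eh_frame (both already extracted by some other means), and it assumes 32-bit big-endian `Elf32_Rela` entries, as would be expected for a PowerPC relocatable object. The function name and the sample data are hypothetical.

```python
import struct

# Assumption: Elf32_Rela entries, big-endian (r_offset, r_info, r_addend).
RELA32_BE = struct.Struct(">IIi")


def out_of_range_offsets(rela_bytes, section_size):
    """Return every r_offset that starts at or beyond section_size.

    In a relocatable file, r_offset is a byte offset from the start of
    the section being relocated, so it should be smaller than that
    section's size.
    """
    bad = []
    for r_offset, _r_info, _r_addend in RELA32_BE.iter_unpack(rela_bytes):
        if r_offset >= section_size:
            bad.append(r_offset)
    return bad


# Hypothetical data: three relocation entries checked against a
# section that is 0x100 bytes long.
sample = b"".join(RELA32_BE.pack(off, 0, 0) for off in (0x1C, 0x40, 0x120))
print([hex(o) for o in out_of_range_offsets(sample, 0x100)])  # ['0x120']
```

The comparison only tests where each relocation starts; a stricter version would also add the width of the relocated field, which depends on the relocation type.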

So my questions are:

(1) Is my deduction correct that relocation offsets should not be greater than the length of the section to which they apply, and that the generated ELF file is therefore in error?

(2) What are people's opinions on the best way of proceeding to diagnose the cause of the incorrect ELF file? Are there any options to ld that will help, or any options that will remove/fix the .eh_frame section and its relocation counterpart .rela.eh_frame?

(3) How would I discover what linker script is being used when the ELF file is generated?

(4) Is there a specific forum where I might find a whole pile of linker experts who would be able to help? I appreciate this is a highly technical question and that many people may not have a clue what I'm talking about!

Thanks for any help!

Florian Weimer
HoPpY

1 Answer


The .eh_frame section is not supposed to have any run-time relocations. All offsets are fixed when the link editor is run (because the object layout is completely known at this point) and the ET_EXEC or ET_DYN object is created. Only ET_REL objects have relocations in that section, and those are never seen by the dynamic linker. So something odd must be going on.
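To see which of these object types a given file is, the `e_type` field of the ELF header can be read directly. The following is a minimal sketch, assuming only the standard ELF header layout (`e_type` is a 16-bit field at byte offset 16, with byte order given by `EI_DATA`); the function name and the sample header are made up for illustration.

```python
import struct

# Standard e_type values from the ELF specification.
ET_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def elf_type(header):
    """Return the name of e_type given at least the first 18 header bytes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, hex(e_type))


# Hypothetical header: 32-bit, big-endian, e_type = 1 (relocatable).
sample_header = b"\x7fELF" + bytes([1, 2, 1]) + bytes(9) + struct.pack(">H", 1)
print(elf_type(sample_header))  # ET_REL
```

For a real file, the first 18 bytes read in binary mode are enough to pass to `elf_type`.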

You can ask such questions on the binutils list or the libc-help list (if you use the GNU toolchain).

EDIT: It seems that you are using a toolchain configured for ZCX exceptions with a target which expects SJLJ exceptions. AdaCore has some documentation about this:

It doesn't quite say how to switch to the SJLJ-based VxWorks 5 toolchain. It is definitely not a matter of using the correct linker script. The choice of exception handling style affects code generation, too.

Florian Weimer
  • Interesting! Does your answer apply to VxWorks systems, which is what I'm using here? The relocations are not at runtime; they happen when a partially linked ELF gets loaded by calling VxWorks "loadModule". We do a "partial link" (using the GNU tool chain), which generates the suspect ELF file. You then load it onto a running VxWorks system by calling "loadModule". It's "loadModule" that's doing the relocations. There must be some relocations here because the partially linked ELF contains calls to VxWorks etc. and those references cannot be resolved during the partial link... – HoPpY Jan 24 '18 at 22:51
  • VxWorks isn't a target I'm familiar with. However, some documentation suggests that it uses SJLJ-style exception handling, and not unwinding information in the `.eh_frame` section. What's your exact VxWorks version and target? Your C++ compiler might be set up incorrectly. – Florian Weimer Jan 25 '18 at 08:07
  • An ancient VxWorks 5.5.1 and PPC405. We're not using C++, we're using Ada. You could well be right, this could be a compiler/linker/linker script set up issue. We're using GnatPro on Windows to build the ELF file, and I assume you'd need the linker scripts set up right in order to target VxWorks 5.5.1. How can I find what linker script LD is using? – HoPpY Jan 25 '18 at 10:46
  • This is not a linker script issue. You need to switch to SJLJ-style exception handling. – Florian Weimer Jan 25 '18 at 20:40
  • It is a linker issue. Rather than produce an error, the linker has produced an ELF that results in a corrupted data section on loading (invalid relocation offsets). If I can write a five-line script that can detect this error, then so can the linker! What's more, we've now discovered that when we remove the -gc-sections option, the linker does not generate the garbage relocation data. We've sent a very simple reproducer to AdaCore. I accept there *may* *also* be a toolchain issue regarding exceptions, but it doesn't help when the linker produces duff ELFs which cause undefined behaviour. – HoPpY Jan 26 '18 at 22:40