
I'm currently working on an emulated CPU (one that does not exist as real hardware), and no tools exist for this type of CPU.

The idea is to write an assembler whose output is "assembly" files containing only .byte lines, so that I can use the x86 binutils tool chain to create the executable (my-as -> x86-as -> ld -> objcopy).

Example:

input.s:
   instx r1, 0x12
   insty r2, 0x34
   instz r3, 0x5678

output.s:
   .byte 0xa1, 0x12         # instx r1, 0x12
   .byte 0xb2, 0x34         # insty r2, 0x34
   .byte 0xc3, 0x56, 0x78   # instz r3, 0x5678
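
A minimal sketch of such a translation step in Python, assuming (from the three lines above only) that the first byte is an opcode nibble followed by the register number and that immediates are stored big-endian. The table `ENCODINGS` and the function `assemble_line` are hypothetical names, not part of any existing tool:

```python
# Hypothetical encoding table, inferred from the three example lines only:
# mnemonic -> (opcode high nibble, immediate width in bytes)
ENCODINGS = {
    "instx": (0xA, 1),
    "insty": (0xB, 1),
    "instz": (0xC, 2),
}


def assemble_line(line):
    """Translate one 'mnemonic rN, imm' line into a .byte directive."""
    source = line.strip()
    mnemonic, operands = source.split(None, 1)
    reg_text, imm_text = [part.strip() for part in operands.split(",")]
    opcode_nibble, imm_size = ENCODINGS[mnemonic]

    reg = int(reg_text.lstrip("r"))
    imm = int(imm_text, 0)  # base 0 accepts both 0x.. and decimal
    if not 0 <= reg <= 0xF:
        raise ValueError(f"register out of range: {reg_text}")
    if not 0 <= imm < (1 << (8 * imm_size)):
        raise ValueError(f"immediate out of range: {imm_text}")

    # Assumed layout: opcode nibble | register, then the immediate, big-endian.
    values = [(opcode_nibble << 4) | reg, *imm.to_bytes(imm_size, "big")]
    byte_list = ", ".join(f"0x{value:02x}" for value in values)
    return f".byte {byte_list}  # {source}"


if __name__ == "__main__":
    for text in ("instx r1, 0x12", "insty r2, 0x34", "instz r3, 0x5678"):
        print(assemble_line(text))
```

This only works because every operand is a constant known at assembly time; as soon as an operand is a symbol, the byte values cannot be computed at this stage.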

My problem is relocations. Example:

input.s:
    instxy r4, symbol

output.s:
    .byte 0xd4, (((symbol&0xF000)>>8)|((symbol&0xF0000)>>16))

... but "symbol" is some symbol defined in another object file or a label whose address is not known before link time.
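
To make the intended bit shuffle concrete, the expression can be evaluated for an address that is already known. The Python sketch below uses a hypothetical helper `instxy_operand` and the sample address 0xABCDE; the linker would have to perform the same computation once it knows the real address:

```python
def instxy_operand(symbol):
    """Evaluate the operand expression from output.s for a known address."""
    # Bits 12..15 of the address move to the high nibble of the byte,
    # bits 16..19 move to the low nibble.
    return ((symbol & 0xF000) >> 8) | ((symbol & 0xF0000) >> 16)


if __name__ == "__main__":
    print(hex(instxy_operand(0xABCDE)))  # prints 0xba
```

So for the address 0xABCDE the second byte becomes 0xBA: the two nibbles end up in reversed order, which is what makes this a non-standard relocation.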

Other developers seem to have more or less the same problem (see "Non-standard relocations"), so at least I'm not the first person who needs this feature.

So my question is: does the GNU binutils tool chain (or an additional GNU tool) already provide an easy way of defining "user-defined" relocation types?

Maybe there are other object file formats supporting this feature?

I already have an idea of how to implement an additional tool that adds this capability to the existing binutils tool chain, but I want to avoid spending time on a new tool if the feature already exists.

EDIT

As it seems that such a feature does not exist yet, I implemented my solution and published it on SourceForge, in case someone else needs such a feature.

Martin Rosenau
  • If you use ELF, you can define your own machine type with custom relocations. However, you have to patch the libbfd to support your new architecture. – fuz Feb 13 '18 at 19:02
  • I don't think support exists, especially not in x86. Are there architectures that support separate hi/lo bytes of addresses already? e.g. AVR? I know some non-x86 formats support relocations for high/low halves of 32-bit addresses, for stuff like MIPS `lui $1, %hi(symbol)` / `ori $1, $1, %lo(symbol)`. How custom do you need this to be, in case that matters? Am I missing something, or is that expression just `(symbol >> 8) & 0xFF00`, i.e. the high byte of the symbol with a padding byte of zeros. Why write the two nibbles separately? – Peter Cordes Feb 13 '18 at 19:03
  • @PeterCordes The two nibbles is only an example of a weird type of relocation not supported by any other CPU. The CPU I'm currently working on uses 16 bits per address (so sizeof(uint16)=1) and for some instructions 6 bits of an address must be extracted... – Martin Rosenau Feb 13 '18 at 20:39
  • But the way you wrote it, both halves of one byte end up together, so the expression you wrote simplifies to `(symbol >> 8) & 0xFF00`, or `(symbol & 0xFF000) >> 8`. I was asking if you meant to write something even more tricky, like a nibble-swap (or if I made an error when simplifying) – Peter Cordes Feb 13 '18 at 23:33
  • @PeterCordes Maybe I did something wrong. But I wanted to show a relocation where the value 0xABCDE ends up in 0xBA (so high and low bits are even reversed). The PowerPC VLE instruction set would be one example where such relocations already exist. – Martin Rosenau Feb 14 '18 at 06:50
  • I think I got it wrong. Shift counts of 8 and 16 are 2 and 4 nibbles / hex digits, not 1 and 2. derp. – Peter Cordes Feb 14 '18 at 07:27
