
I'm learning RISC-V (RV64) and implementing a simulator. I compiled the following simple program and disassembled the generated ELF with riscv64-unknown-linux-gnu-objdump. I have two questions.

int main() {
  return 0;
}
  1. If I use riscv64-unknown-linux-gnu-objdump -d ./xxx.elf, the disassembly looks like this:
Disassembly of section .text:

0000000080000000 <_start>:
    80000000:   00000413                li      s0,0
    80000004:   00009117                auipc   sp,0x9
    80000008:   ffc10113                add     sp,sp,-4 # 80009000 <_end>
    8000000c:   00c000ef                jal     80000018 <_trm_init>

0000000080000010 <main>:
    80000010:   00000513                li      a0,0
    80000014:   00008067                ret

0000000080000018 <_trm_init>:
    80000018:   ff010113                add     sp,sp,-16
    8000001c:   00000517                auipc   a0,0x0
    80000020:   01c50513                add     a0,a0,28 # 80000038 <_etext>
    80000024:   00113423                sd      ra,8(sp)
    80000028:   fe9ff0ef                jal     80000010 <main>
    8000002c:   00050513                mv      a0,a0
    80000030:   00100073                ebreak
    80000034:   0000006f                j       80000034 <_trm_init+0x1c>

With the -M no-aliases option added, the disassembly looks like this:

Disassembly of section .text:

0000000080000000 <_start>:
    80000000:   00000413                addi    s0,zero,0
    80000004:   00009117                auipc   sp,0x9
    80000008:   ffc10113                addi    sp,sp,-4 # 80009000 <_end>
    8000000c:   00c000ef                jal     ra,80000018 <_trm_init>

0000000080000010 <main>:
    80000010:   00000513                addi    a0,zero,0
    80000014:   00008067                jalr    zero,0(ra)

0000000080000018 <_trm_init>:
    80000018:   ff010113                addi    sp,sp,-16
    8000001c:   00000517                auipc   a0,0x0
    80000020:   01c50513                addi    a0,a0,28 # 80000038 <_etext>
    80000024:   00113423                sd      ra,8(sp)
    80000028:   fe9ff0ef                jal     ra,80000010 <main>
    8000002c:   00050513                addi    a0,a0,0
    80000030:   00100073                ebreak
    80000034:   0000006f                jal     zero,80000034 <_trm_init+0x1c>

So by default, both addi and add are printed with the add mnemonic. From the RISC-V spec, I know that li, j, ret, and mv are pseudo-instructions. But addi and add are both real instructions, so why is addi printed as add? Is an alias different from a pseudo-instruction? Is there a standard document that defines these aliases? Even if this is tool-specific, I would like to see an authoritative statement.
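For context on why this matters for a simulator: whatever mnemonic objdump prints, the two instructions stay distinct in the machine code, so a decoder can tell them apart from the raw word. A minimal sketch in C, assuming the standard 32-bit base encoding (OP-IMM major opcode 0x13 for addi, OP major opcode 0x33 for add); the helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field extraction for the 32-bit base RISC-V encoding. */
static uint32_t opcode(uint32_t insn) { return insn & 0x7fu; }
static uint32_t funct3(uint32_t insn) { return (insn >> 12) & 0x7u; }
static uint32_t funct7(uint32_t insn) { return insn >> 25; }

/* ADDI: OP-IMM major opcode (0x13) with funct3 = 0. */
bool is_addi(uint32_t insn) {
    return opcode(insn) == 0x13u && funct3(insn) == 0u;
}

/* ADD: OP major opcode (0x33) with funct3 = 0 and funct7 = 0. */
bool is_add(uint32_t insn) {
    return opcode(insn) == 0x33u && funct3(insn) == 0u && funct7(insn) == 0u;
}

/* Sign-extended 12-bit I-type immediate taken from bits 31:20. */
int32_t itype_imm(uint32_t insn) {
    return (int32_t)((insn >> 20) ^ 0x800u) - 0x800;
}
```

Applied to the word ff010113 from the listing, is_addi is true, is_add is false, and itype_imm gives -16, which matches the operands shown for that line in both listings.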

  2. In the second listing above, the operand of the jal instruction (e.g. jal ra,80000018) is printed as an absolute address rather than a pc-relative offset. How can I make objdump print the pc-relative offset instead of the absolute address?
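To make the expectation concrete: the printed operand appears to be the resolved jump target, so the offset I would like to see is the target minus the address of the jal itself. A small sketch of that arithmetic in C (rel_offset is a hypothetical helper name, not something objdump provides):

```c
#include <stdint.h>

/* Byte offset a jump located at `pc` must encode to reach `target`. */
int64_t rel_offset(uint64_t pc, uint64_t target) {
    return (int64_t)(target - pc);
}
```

For the jal at 8000000c with target 80000018 this gives 12, and for the jal at 80000028 with target 80000010 it gives -24.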

My Environment:
Machine: x86_64
OS: Ubuntu 22.04 (VMware virtual machine)
riscv64-unknown-linux-gnu-objdump -v: GNU objdump (GNU Binutils) 2.40.0.20230214

Thanks in advance!

beyond
  • The relative offset is part of the machine code. Not always easy to pick out by eye, though, since RISC-V splits up immediates into chunks distributed around the instruction word. In this case, `j 80000034` is a jump to itself, as we can see from the target address being its own address, and the upper 20 bits (first 5 hex digits) all being `0`. UJ-type (J-type) instructions keep their immediate there, with the destination register and opcode in the low 12 bits (https://inst.eecs.berkeley.edu/~cs61c/resources/su18_lec/Lecture7.pdf). So the relative displacement is `0`. – Peter Cordes Jul 16 '23 at 18:14
  • In general IDK if `objdump` has an option to separate out the relative displacements in the disassembly. It might not, since there's no GAS syntax for writing asm source that way, only with labels or absolute addresses as targets. – Peter Cordes Jul 16 '23 at 18:16
  • All right. Extracting the immediate offset by hand according to the spec is straightforward, so I assumed there must be a way to have the tool do it. – beyond Jul 17 '23 at 03:32
  • Usually when people read disassembly, they want to know where the jump actually goes; that's the use case `objdump` is designed around. I don't think `objdump` has an option to make it work as a machine-code analyzer which just shows the fields like you're looking for. Certainly it could, but I didn't find one in the man page either. (There is `--adjust-vma=offset` to add an offset to every absolute address it prints.) Feel free to add such an option; GNU Binutils is open-source; the maintainers might even accept a patch to add it. – Peter Cordes Jul 17 '23 at 04:00
  • I got it. Do you know the answer to the first question? Thanks! – beyond Jul 17 '23 at 04:25
  • Why does `objdump` by default disassemble the add-immediate opcode as `add` instead of a separate `addi` mnemonic? Seems a fairly arbitrary choice, but some other ISAs like x86 and ARM don't use different mnemonics for add-immediate vs. add-register, so it makes them more like that. The extra letter in the mnemonic is for many purposes just visual noise; you just look at the operands to see what's being added and it's usually immediately obvious (no pun intended) whether it's a constant or a register. – Peter Cordes Jul 17 '23 at 04:30
  • Maybe it's more to do with accepting that as input: if you want to change an instruction from adding a constant to adding a runtime variable (or a larger constant that needed to get loaded into a register earlier), you can just change the operand without also changing the mnemonic. `objdump` is part of GNU Binutils which includes `as` the assembler. It could have been designed to still accept `add x1, x2, 123` without that being the default for disassembly, but I think that implies that Binutils developers think it's normal to write `add x1, x2, 123` rather than `addi`. – Peter Cordes Jul 17 '23 at 04:33
  • So it's more of a convention than a rule. Maybe I'm overthinking it. Thank you! – beyond Jul 17 '23 at 05:03
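The by-hand extraction discussed in the comments can be sketched in C. This assumes the standard J-type layout (imm[20] in bit 31, imm[10:1] in bits 30:21, imm[11] in bit 20, imm[19:12] in bits 19:12, with rd and the opcode in the low 12 bits); jal_offset and jal_target are hypothetical helper names, not part of Binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Signed byte offset encoded in a JAL instruction word. */
int32_t jal_offset(uint32_t insn) {
    assert((insn & 0x7fu) == 0x6fu);  /* only meaningful for the JAL opcode */
    uint32_t imm = ((insn >> 31) & 0x1u)   << 20   /* imm[20]    */
                 | ((insn >> 21) & 0x3ffu) << 1    /* imm[10:1]  */
                 | ((insn >> 20) & 0x1u)   << 11   /* imm[11]    */
                 | ((insn >> 12) & 0xffu)  << 12;  /* imm[19:12] */
    /* Sign-extend from bit 20. */
    return (int32_t)(imm ^ 0x100000u) - 0x100000;
}

/* Absolute target, i.e. the kind of value objdump prints as the operand. */
uint64_t jal_target(uint64_t pc, uint32_t insn) {
    return pc + (uint64_t)(int64_t)jal_offset(insn);
}
```

With the words from the listing, 00c000ef decodes to an offset of 12, fe9ff0ef to -24, and 0000006f to 0. Adding each offset to the instruction's own address reproduces the targets 80000018, 80000010, and 80000034 shown in the disassembly.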
