
I am trying to better understand the assembly generated from C++. I wrote a simple .cc file to reverse engineer, but I am having trouble understanding what is going on. Ultimately, I want to understand what is executed before main is called when global variables are involved.

How are the variables y1 and y2 initialized? What is the assembly doing?

Here's the code:

#include <iostream>
#include <array>

struct y {int i; int j;};

const y y1{7,2}, y2{6,4};

int k = 9;

int jy = k;

int main() {}

Here's the generated disassembly from objdump -D:

00000000004007e4 <_ZL2y1>:
  4007e4:   07                      (bad)
  4007e5:   00 00                   add    %al,(%rax)
  4007e7:   00 02                   add    %al,(%rdx)
  4007e9:   00 00                   add    %al,(%rax)
    ...

00000000004007ec <_ZL2y2>:
  4007ec:   06                      (bad)
  4007ed:   00 00                   add    %al,(%rax)
  4007ef:   00 04 00                add    %al,(%rax,%rax,1)
    ...
Peter Cordes
jerry g
  • This isn't code and it doesn't make sense to disassemble it. It's just the bytes `07 00 00 00 02 00 00 00` in memory, which are the two little-endian `int`s 7 and 2 that are the two members of `y1`. – Nate Eldredge Dec 29 '20 at 21:56
  • Looks like you are disassembling the data section of the application. Those are not instructions; that is raw data. – Martin York Dec 29 '20 at 22:13
  • I was using `objdump -D`. It greatly confused me until the commenter below clarified. – jerry g Dec 29 '20 at 22:21

1 Answer


The variables are initialized by static initialization, meaning their values are in place before any code (necessarily) executes. The implementation accomplishes this by storing the objects' memory image directly in the compiled binary.

Look at the hexadecimal values: they match the numbers you assigned in the initializations. Those aren't instructions at all. The disassembler just printed add out of ignorance.

Potatoswatter
  • `The disassembler just printed add out of ignorance.` cracked me up haha – Rafael de Bem Dec 29 '20 at 21:57
  • On the topic of static initialization, the jy variable in my example is NOT statically initialized? Because it depends on the k variable? (I see assembly instructions initializing it before main and it is located in the .bss section). Correct me if I am wrong. – jerry g Dec 29 '20 at 22:07
  • Good question! No, it's not, but the compiler is free to optimize away those instructions anyway and initialize it statically, as if it were. – Potatoswatter Dec 29 '20 at 22:12
  • It's not exactly ignorance. Those are the instructions the CPU would execute if you used `mprotect` to make that page executable and jumped there. Like you would for testing shellcode; `objdump -D` is only useful when you do care about interpreting data in other sections as machine code. (Or as a quick hack if you want to look at the hexdump part and ignore the disassembly). – Peter Cordes Dec 30 '20 at 00:35
  • @PeterCordes aha, I didn't notice that `-D` forces disassembly for all sections whereas `-d` is the more common option. – Potatoswatter Dec 30 '20 at 00:49
  • Yeah, OP only mentioned that in a comment. Even `const` globals will go in `.rodata` or `.rdata`, which is a separate section from `.text`, which is why this isn't normally a problem. (`objdump -d` would print empty output for the OP's file that has no functions. Or maybe an iostream constructor or something.) – Peter Cordes Dec 30 '20 at 00:51