I have the following C program:
#include <stdio.h>

int main() {
    float number1, number2, sum = 0.;
    number1 = .5;
    number2 = .3;
    /* Subtract (number1 + number2) until sum drops to -10000000 or below. */
    while (sum > -10000000.)
        sum -= number1 + number2;
    printf("%f", sum);
    return 0;
}
Its corresponding assembly is as follows:
_main: ; @main
.cfi_startproc
; %bb.0:
sub sp, sp, #16 ; =16
.cfi_def_cfa_offset 16
str wzr, [sp, #12]
str wzr, [sp]
mov w8, #1056964608
str w8, [sp, #8]
mov w8, #39322
movk w8, #16025, lsl #16
str w8, [sp, #4]
LBB0_1: ; =>This Inner Loop Header: Depth=1
ldr s0, [sp]
fcvt d0, s0
adrp x8, lCPI0_0@PAGE
ldr d1, [x8, lCPI0_0@PAGEOFF]
fcmp d0, d1
b.le LBB0_3
; %bb.2: ; in Loop: Header=BB0_1 Depth=1
ldr s0, [sp, #8]
ldr s1, [sp, #4]
fadd s1, s0, s1
ldr s0, [sp]
fsub s0, s0, s1
str s0, [sp]
b LBB0_1
LBB0_3:
mov w0, #0
add sp, sp, #16 ; =16
ret
.cfi_endproc
; -- End function
.subsections_via_symbols
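For anyone reading the listing: the immediates stored at [sp, #8] and [sp, #4] appear to be the single-precision bit patterns of .5 and .3. Below is a minimal sketch to check that, assuming float is a 32-bit IEEE-754 single. The helper names float_bits and mov_movk are hypothetical and not part of the program above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw bit pattern of a float.
   Assumes float is a 32-bit IEEE-754 single. */
uint32_t float_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits); /* well-defined way to reinterpret the bytes */
    return bits;
}

/* Hypothetical helper: rebuilds a 32-bit constant the way the mov/movk pair
   does. mov sets the low 16 bits, movk ... lsl #16 sets bits 16..31. */
uint32_t mov_movk(uint32_t low16, uint32_t high16) {
    return (high16 << 16) | low16;
}
```

Under that assumption, float_bits(.5f) equals 1056964608 (0x3F000000), and float_bits(.3f) equals mov_movk(39322, 16025) (0x3E99999A).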
I want to analyse the latency of each instruction, so I'm looking for a way to obtain a program counter trace.
The desired output is as follows:
0000000000 _main: ; @main
0000000001 .cfi_startproc
0000000002 ; %bb.0:
0000000003 sub sp, sp, #16 ; =16
0000000004 .cfi_def_cfa_offset 16
0000000005 str wzr, [sp, #12]
0000000006 str wzr, [sp]
0000000007 mov w8, #1056964608
0000000008 str w8, [sp, #8]
0000000009 mov w8, #39322
0000000010 movk w8, #16025, lsl #16
0000000011 str w8, [sp, #4]
...
where the first column is the timestamp, in picoseconds, nanoseconds, or microseconds.
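To pin down the line format only (not how the trace is captured), here is a minimal sketch of one output line. The helper name format_trace_line and the 10-digit zero padding are assumptions taken from the sample above, and the timestamp unit is left open.

```c
#include <stdio.h>

/* Hypothetical helper: writes one trace line, "<timestamp> <instruction>",
   into buf. The timestamp is zero-padded to 10 digits as in the sample.
   Returns snprintf's result (the length the full line needs). */
int format_trace_line(char *buf, size_t size,
                      unsigned long long timestamp, const char *instruction) {
    return snprintf(buf, size, "%010llu %s", timestamp, instruction);
}
```

Called with timestamp 3 and the text "sub sp, sp, #16 ; =16", this produces the fourth line of the sample.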
The target system is macOS, the compiler is LLVM, and the debugger is LLDB.