I'm trying to print a floating-point number by calling printf, but it always prints the value of pi (3.1415). The result, which should be the area of a circle, is supposed to be stored in the pi variable after being calculated.

.section .data
    value:
        .quad 0
    result:
            .asciz "The result is %lf \n"
    pi:
        .double 3.14159

.section .bss
.section .text
.globl _start
.type area, @function
area:

    nop
    imulq %rbx, %rbx
    movq %rbx, value
    fildq value
    fmul pi                           # multiply r^2 by pi
    fst  pi                           # Store result to pi
    movupd pi, %xmm0                  # move result to xmm0
    nop
    ret

_start:

    nop
    movq $2, %rbx
    call area                 # calculate for radius 2
    leaq result, %rdi         
    movq $1, %rax             # specify only one float value
    call printf                 
    movq $0, %rdi             # Exit
    call exit                     
    nop

I always get 3.1415 back. I don't know why, since pi is supposed to be overwritten by the fst instruction.

Sep Roland
KMG
  • `movupd` loads 16 bytes, but you only have one `.double`. Use `movsd`. Also, shouldn't you be storing to `value` instead of overwriting your `pi` constant? Also, if you insist on using legacy x87, use `fldpi` to get a more accurate pi constant. Also, the standard calling convention passes the first arg in RDI, not RBX. Your area function is super weird. – Peter Cordes Jan 28 '21 at 19:22
  • @PeterCordes Thanks a lot. This function was just for testing, so I didn't pay much attention to the details. But why didn't loading 16 bytes cause any problem, even when I changed where the `pi` variable is defined in the `.data` section? – KMG Jan 28 '21 at 21:54
  • Unless it's the last 8 bytes before an unmapped page, it won't actually fault. But that could happen if you link with other files that also put stuff in .data. Functions that accept a `double` arg in an XMM register don't care if the top half of the register is zero or not. Loading a high half is usually just inefficient (store forwarding stall after an 8-byte store, and possibly cache-line split, and on old CPUs `movupd` is inherently slower than `movsd` or even `movapd`.) – Peter Cordes Jan 29 '21 at 00:55
  • @KhaledGaber Just because it doesn't seem to cause any problems right now doesn't mean it's correct. Wrong code may exhibit its defects only when you expect it least. – fuz Jan 29 '21 at 01:53

1 Answer

You need to add a size suffix to your floating-point instructions if they use memory operands. Otherwise, the GNU assembler will implicitly use single precision (32-bit operands), which is not what you want. To fix your code, change

fmul pi                           # multiply r^2 by pi
fst  pi                           # Store result to pi

to

fmull pi                           # multiply r^2 by pi
fstl  pi                           # Store result to pi

Some other remarks about your code:

  • use rip-relative addressing modes instead of absolute addressing modes if possible. Specifically, this means replacing foo with foo(%rip) in your memory operands, including for lea result(%rip), %rdi

  • make sure to leave a clean x87 stack at the end of your functions or other code may spuriously cause it to overflow. For example, use fstpl pi(%rip) to store the result and pop it off the stack.

  • use movsd, not movupd, to load a single double into an SSE register; movupd loads a pair of doubles (16 bytes).

  • consider using SSE instead of x87 if possible for all the math. It's the standard way to do scalar FP math in x86-64, that's why XMM registers are part of the calling convention. (Unless you need 80-bit extended precision, but you have a pi constant in memory that's far less accurate than x87 fldpi.)

       ...
       cvtsi2sd   %rbx, %xmm0
       mulsd      pi(%rip), %xmm0
       ret
    
Peter Cordes
fuz
  • It works now, thanks a lot. But I'm curious why it just echoed back the pi value instead of showing whatever value was stored in the lower 32 bits of pi. – KMG Jan 28 '21 at 16:01
  • @KhaledGaber By operating in single precision, only the least significant 32 bits of `pi` were read and written. When printing, these bits are not significant enough to affect the value displayed. – fuz Jan 28 '21 at 16:02