In the documentation for the ARM instruction frsqrts, it says:
This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD and FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD and FP register.
I interpret this as yₙ₊₁ = (3 - xyₙ)/2, and indeed the following code confirms this interpretation:
.global _main
.align 2
_main:
fmov d0, #2.0 // Goal: Compute 1/sqrt(2)
fmov d1, #0.5 // initial guess
frsqrts d2, d0, d1 // first approx
mov x0, #0 // exit status 0
mov x16, #1 // '1' = terminate syscall
svc #0x80 // "supervisor call"
However, reading about the Newton iterate for the inverse square root, I see that the iteration is not yₙ₊₁ = (3 - xyₙ)/2, but rather yₙ₊₁ = yₙ(3 - xyₙ²)/2. Now, obviously I can use frsqrts in combination with other instructions to get this:
fmov d0, #2.0 // Goal: Compute 1/sqrt(2)
fmov d1, #0.5 // initial guess
fmul d2, d1, d1 // d2 = y*y (initial guess squared)
frsqrts d3, d0, d2 // d3 = (3 - x*y*y)/2
fmul d4, d1, d3 // d4 = y*(3 - x*y*y)/2
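To sanity-check the arithmetic without running on hardware, here is a rough Python model of the two snippets above. The helper names `frsqrts_model` and `rsqrt_newton_step` are my own, and the model only reproduces the documented formula (3 - a*b)/2 in double precision; it ignores the fused rounding and special-case handling of the real instruction.

```python
import math


def frsqrts_model(a: float, b: float) -> float:
    """Hypothetical model of the documented FRSQRTS arithmetic: (3 - a*b) / 2.

    Assumption: plain double arithmetic, no fused rounding, no special cases.
    """
    return (3.0 - a * b) / 2.0


def rsqrt_newton_step(x: float, y: float) -> float:
    """One Newton step for 1/sqrt(x), mirroring fmul / frsqrts / fmul."""
    y_squared = y * y                      # fmul    d2, d1, d1
    factor = frsqrts_model(x, y_squared)   # frsqrts d3, d0, d2
    return y * factor                      # fmul    d4, d1, d3


x = 2.0
y = 0.5  # same initial guess as in the assembly

# First snippet: frsqrts applied directly to x and the guess.
print("frsqrts alone:", frsqrts_model(x, y))

# Second snippet, repeated a few times to watch it converge.
for i in range(5):
    y = rsqrt_newton_step(x, y)
    print("step", i + 1, y)

print("target:", 1.0 / math.sqrt(x))
```

With x = 2 and a guess of 0.5, the bare `frsqrts_model` call gives 1.0, while the first full three-instruction step gives 0.625 and later steps approach 1/sqrt(2).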
But it seems weird to introduce a custom instruction which only gets you halfway to your goal. Am I misusing this instruction?