On page three of this OpenCL reference sheet (broken link) there are two built-in vector length functions with identical parameters: length() and fast_length().

What is the difference between these functions? I gather from the name that one is 'faster' than the other, but in what circumstances? Does it sacrifice accuracy for this speed increase? If not, why would one ever use length() over fast_length()?

Patrick B.
sebf

1 Answer

According to the OpenCL spec (version 1.1, page 215):

  • float length(floatn p): Return the length of vector p, i.e. sqrt(p.x²+p.y²+...)

  • float fast_length(floatn p): Return the length of vector p computed as half_sqrt(p.x²+p.y²+...)

So fast_length uses half_sqrt, while length uses sqrt. As you might guess, sqrt has stronger accuracy guarantees but may be slower. More to the point:

  • Min accuracy of sqrt: 3 ulp (unit of least precision)

  • Min accuracy of half_sqrt: 8192 ulp

So half_sqrt can be about 11 bits less accurate than sqrt, since 8192 = 2^13 while 3 is roughly 2^1.6. In fact it can be up to 13 bits less accurate, because nothing prevents an implementation's sqrt from being better than strictly required. Since float has a 23-bit mantissa (plus one implicit bit), half_sqrt only promises about 10 bits of precision (11 bits including the implicit 1).

It might, however, be faster if the hardware has such a function. It's not unusual for hardware to have a sqrt or rsqrt instruction that provides only a small number of bits (like 10-14), with Newton-Raphson iterations run after the instruction to reach the required precision. In such a case, using half_sqrt is obviously faster.
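To make the "coarse instruction plus Newton-Raphson" idea concrete, here is a minimal sketch in plain C (not OpenCL C). Everything in it is an assumption for illustration: reference_sqrt, coarse_sqrt, newton_step and ulp_distance are hypothetical helpers, and coarse_sqrt only imitates a low-precision hardware result by clearing the low 13 mantissa bits of an accurate value. No real device or OpenCL implementation is claimed to work this way.

```c
#include <stdint.h>
#include <string.h>

/* Reference square root for x > 0: Newton-Raphson in double precision, run
   well past convergence. Hand-written so the sketch needs no math library. */
static float reference_sqrt(float x) {
    double y = x > 1.0f ? (double)x : 1.0;   /* start at or above the root */
    for (int i = 0; i < 64; ++i)
        y = 0.5 * (y + (double)x / y);
    return (float)y;
}

/* Hypothetical stand-in for a coarse hardware sqrt: clear the low 13 mantissa
   bits of the reference, so the result is off by fewer than 8192 ulp. */
static float coarse_sqrt(float x) {
    float r = reference_sqrt(x);
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFFE000u;                     /* keep sign, exponent, top 10 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton-Raphson step for y*y = x, in single precision. */
static float newton_step(float x, float y) {
    return 0.5f * (y + x / y);
}

/* Distance in ulps between two positive, finite floats. */
static uint32_t ulp_distance(float a, float b) {
    uint32_t ia, ib;
    memcpy(&ia, &a, sizeof ia);
    memcpy(&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
}
```

Comparing ulp_distance(coarse_sqrt(x), reference_sqrt(x)) with ulp_distance(newton_step(x, coarse_sqrt(x)), reference_sqrt(x)) for a few positive inputs shows the trade-off: the coarse value may be off by thousands of ulps, while a single extra step brings it to within a handful. That extra step is the work a half_sqrt-style function can skip.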

Grizzly
Thank you, especially for the explanation of the difference in accuracy and the source; it's those details that allow for an informed choice between them. – sebf Apr 17 '12 at 21:36