Need to find when value is == 0, but I cannot due to numerical errors

Question

I have 2 large lists of vectors (>10,000 vectors each, say v_i and w_i) and I am trying to find when v_i cross-product w_i = 0, or, when v_i x w_i = 0.

The lists of vectors are previously calculated (this is Computational Fluid Dynamics and the calculated vectors represent properties of a fluid. I am doing research in Vortex Identification and this calculation is necessary).

I am trying to find when the cross product == 0 but I only get 3 results out of the thousands where the cross product is satisfied. We are trying to automate a method done by hand so we know for a fact that there are more than 3 vectors.

Our assumption is that since we are using basic numerical methods (of low orders) to calculate the vectors, there is a build up of errors.

TLDR: In essence, this does not work due to numerical errors:

real :: cross1, cross2, cross3
logical :: check1, check2, check3
logical :: is_seed

check1 = cross1 == 0.0
check2 = cross2 == 0.0
check3 = cross3 == 0.0

is_seed = check1 .and. check2 .and. check3

so, we have to do this:

real :: cross1, cross2, cross3
real :: tol
logical :: check1, check2, check3
logical :: is_seed

tol = 1.0e-4 ! NEED TO FIND OUT HOW TO CALCULATE

check1 = cross1 <= (0.0 + tol)
check2 = cross2 <= (0.0 + tol)
check3 = cross3 <= (0.0 + tol)

is_seed = check1 .and. check2 .and. check3

but I want to know how to calculate tol automatically and not hard code it. How can this be done?

You said you want to calculate an acceptable tolerance, yet you did not provide any information about how you want to go about doing that. What would you consider an acceptable tolerance? — ikegami, Nov 04 '21 at 23:42
Because although the language this is written in is Fortran, this is not a Fortran only question. C has compiler ties to Fortran (typically) so people that program in C can use programs written in Fortran in C code. So some people that have experience with C have experience with interacting with Fortran. — Oscar Alvarez, Nov 04 '21 at 23:42
So you tagged it [tag:c] cause it might get people familiar with FORTRAN? That's not right. That's what [tag:fortran] is for. — ikegami, Nov 04 '21 at 23:43
@ikegami That is what I am trying to determine. I want the lowest values up to a certain threshold, BUT I want the threshold to be determined by the data, not me. — Oscar Alvarez, Nov 04 '21 at 23:44
yes, but you still haven't said what formula you want to use. You still need to determine that. — ikegami, Nov 04 '21 at 23:44
@ikegami that is not why I tagged C. If it would make you feel better, I can remove the C tag. And as for the second question, this is the intuition that I am needing. Yes, I require a formula to calculate tol, but I do not know which one to use. If you have some sort of insight or a way for me to find it on my own, I will be grateful. Thank you. — Oscar Alvarez, Nov 04 '21 at 23:47
Well, for the best (smallest) tolerance, you want to calculate the maximum error you could get, and go a bit higher than that. — ikegami, Nov 04 '21 at 23:55
"how to calculate tol automatically" sounds more like a maths/physics question than a programming question. I suggest trying [https://math.stackexchange.com/](https://math.stackexchange.com/) or [https://physics.stackexchange.com/](https://physics.stackexchange.com/). If you get a method for calculating tol from there, and still need help coding it up, then that would be a question for us here. — veryreverie, Nov 05 '21 at 08:13
It sounds more like a numerical analysis problem to me. I think computational science or the computer science will be the best place - but be prepared to pose your problem in an abstract, language agnostic way (you have to do this anyway, the OP is quite right when he/she says C is as good a fit for this as Fortran. As is C++, python, basic, anything with floating point maths) — Ian Bush, Nov 05 '21 at 08:50
Quick comment on the language - what is boolean? You mean `logical` surely — Ian Bush, Nov 05 '21 at 08:50
@veryreverie I was thinking about doing that but I was afraid of getting roasted by the mods telling my to post coding questions here. But that is a good idea, if my question does not gain traction here, I will move it to math.stackexchange. Thank you. And yes, @IanBush, you are quite right, it is supposed to be `logical`. I suppose I was a bit too hasty thank you. — Oscar Alvarez, Nov 05 '21 at 16:22

John Alexiou · Answer 1 · 2021-11-05T19:08:56.607

Edit 1

As pointed out in the comments, the function below is entirely equivalent to the built-in function spacing(x).

Edit 2

Use the following function ulp(x) to find the value of the least significant bit in the mantissa of an ieee754 number x

32-bit

 elemental function ulp32(x) result(d)
 real(real32), intent(in) :: x
 real(real32) :: d
     d = 2.0**(-floor(-log(x)/log(2e0))-24)
 end function

64-bit

 elemental function ulp64(x) result(d)
 real(real64), intent(in) :: x
 real(real64) :: d
     d = 2d0**(-floor(-log(x)/log(2d0))-53)
 end function

interface

 interface ulp
     procedure :: ulp32, ulp64
 end interface

with some results given values between 1 and 1e9

              x            32bit                    64bit
         517.54       0.00006104      0.00000000000011369
        1018.45       0.00006104      0.00000000000011369
        1972.33       0.00012207      0.00000000000022737
        5416.69       0.00048828      0.00000000000090949
       11812.67       0.00097656      0.00000000000181899
       13190.24       0.00097656      0.00000000000181899
       18099.97       0.00195312      0.00000000000363798
       28733.47       0.00195312      0.00000000000363798
       86965.21       0.00781250      0.00000000001455192
      135734.23       0.01562500      0.00000000002910383
      203975.41       0.01562500      0.00000000002910383
      780835.66       0.06250000      0.00000000011641532
     2343924.58       0.25000000      0.00000000046566129
     2552437.80       0.25000000      0.00000000046566129
     6923904.28       0.50000000      0.00000000093132257
     8929837.66       1.00000000      0.00000000186264515
    29408286.38       2.00000000      0.00000000372529030
    70054595.74       8.00000000      0.00000001490116119
   231986024.46      16.00000000      0.00000002980232239
   392724963.99      32.00000000      0.00000005960464478

It is recommended to pick a tol value that is a factor of ulp, and this factor should be a power of two. Each power means shifting one bit over to increase the tolerance by a power of two. You can expect each operation that propagates round-off errors to also make the error larger proportionally to 2**n where n is the number of operations.

So depending on the magnitude of the values compared, the tolerance should be approximated by tol = factor * abs(x) * 2**(-24)

For example, comparing two values of x=12418.16752 and y=12418.16774 pick a tolerance with

tol = 8*ulp(15000.0)
check = abs(x-y) <= tol

I get a tol=7.8125000E-03 and the result check=.true.

Edit 0

<Post deleted>

IMO this doesn't answer the question, which is how to calculate `tol`, not how to use it — Ian Bush, Nov 05 '21 at 08:51
@IanBush - I see that now. I edited the post with the way I have decomposed floating-point numbers into a mantissa and exponent, and then found the value of the least significant bit of the mantissa. — John Alexiou, Nov 05 '21 at 13:32
Does this function `ulp` do anything that the intrinsic function `spacing` doesn't do? — High Performance Mark, Nov 05 '21 at 14:17
@JohnAlexiou This is fantastic, thank you. I will apply this to my code and see how it works out. I will let you know if this turns out to be the solution to my problem. — Oscar Alvarez, Nov 05 '21 at 16:48
@HighPerformanceMark - I wasn't aware of the `spacing(x)` function, and it seems to be exactly the same. — John Alexiou, Nov 05 '21 at 19:07

score 2 · Answer 2 · answered Nov 06 '21 at 10:44

In the first place, you should have knowledge of the error on the vector components, otherwise no test for zero can be conclusive.

Now the absolute error on the cross product is like

(u + δu) x (v + δv) - uv ~ u x δv + δu x v

and in the worst case the vectors can be orthogonal, giving the estimate |u||δu|+|v||δv|=(|u|+|v|)δ. So a value of |u x v| below this bound could correspond to parallel vectors.

score 0 · Answer 3 · answered Nov 15 '21 at 21:16

I found a solution to my problem.

First, I take the magnitude of the vector. I do this so I only have to work with one value instead of 3. This is possible since ||v|| = 0 if and only if v = 0. I save the magnitude of those vectors in a new array called cross_mag (since the vector is the result of a cross product).
Then I find the lowest value in the array that is not zero. (This is to discount outliers that may be equal to zero)
I found that when the number is written in scientific notation, the exponent of 10 will give me a power x that I can base my tolerance off of. I do this using log₁₀( min_value ).
I then increase the power of the lowest value by 1, which increases the total tolerance directly by a factor of 10.
I use this new value as the exponent of my tol. (This can of course be scaled which I have done by a factor of 1.5).

Or:

real, dimension(:,:,:) :: cross_mag
real :: min_val, ex, tol
integer :: imax, jmax, kmax

! Find new "zero" that is based off of the lowest values.
! This new zero is required due to the buildup of numerical errors.
min_val = rrspacing(1.0)

do k = 1, kmax
  do j = 1, jmax
    do i = 1, imax

      if ((cross_mag(i,j,k) < min_val) .and. (cross_mag(i,j,k) .ne. 0.0)) then

        min_val = cross_mag(i,j,k)

      end if

    end do
  end do
end do

ex = log10(abs(min_val))
ex = floor(ex)

tol = 1.5 * 10.0**(ex + 1.0)

write(*,*) 'min_val: ', min_val
write(*,*) 'tol: ', tol

I found this works plenty well for my work and gives me a reasonable amount of vectors to work with. I thank you all for helping my find the rrspacing() function which helped me create an arbitrarily large number.

Are you sure you don't have the case where two vectors are very close to parallel but not quite parallel? Because then `||x^y||<<||x||`, so you might be missing cases. — veryreverie, Nov 16 '21 at 09:27

Need to find when value is == 0, but I cannot due to numerical errors

3 Answers3

Edit 1

Edit 2

Edit 0