
I was trying to test the output of an skcuda FFT against a numpy FFT for unit testing. When the test failed, I soon realized it wasn't because I had done something wrong: skcuda simply doesn't produce the same answer. I knew the results would differ by a bit, but at least one of the numbers is several orders of magnitude off from what numpy produces, and both allclose and almost_equal report massive mismatches (33% and 25% for rtol=1e-6, 16% for atol=1e-6). What am I doing wrong here? Can I fix this?

Test file:

import pycuda.autoinit
from skcuda import fft
from pycuda import gpuarray
import numpy as np

def test_skcuda():
    # Two rows with identical shape but different magnitude (second is 10x)
    array_0 = np.array([[1, 2, 3, 4, 5, 4, 3, 2, 1, 0]], dtype=np.float32)
    array_1 = array_0 * 10
    time_domain_signal = np.array([array_0[0], array_1[0]], dtype=np.float32)
    fft_point_count = 10

    # Batched real-to-complex plan: two 10-point FFTs in one call
    fft_plan = fft.Plan(fft_point_count, np.float32, np.complex64, batch=2)

    # R2C output holds only the non-redundant half: N // 2 + 1 bins per row
    fft_reserved = gpuarray.empty((2, fft_point_count // 2 + 1), dtype=np.complex64)
    fft.fft(gpuarray.to_gpu(time_domain_signal), fft_reserved, fft_plan)

    np.testing.assert_array_almost_equal(
        np.fft.rfft(time_domain_signal, fft_point_count), fft_reserved.get())

test_skcuda()

Assertion failure:

AssertionError: 
Arrays are not almost equal to 6 decimals

(mismatch 25.0%)
 x: array([[ 2.500000e+01+0.000000e+00j, -8.472136e+00-6.155367e+00j,
        -1.193490e-15+2.331468e-15j,  4.721360e-01-1.453085e+00j,
         2.664535e-15+0.000000e+00j,  1.000000e+00+0.000000e+00j],...
 y: array([[ 2.500000e+01+0.000000e+00j, -8.472136e+00-6.155367e+00j,
         8.940697e-08+5.960464e-08j,  4.721359e-01-1.453085e+00j,
         0.000000e+00+0.000000e+00j,  1.000000e+00+0.000000e+00j],...

printed output:

#numpy
[[ 2.50000000e+01+0.00000000e+00j -8.47213595e+00-6.15536707e+00j
  -1.19348975e-15+2.33146835e-15j  4.72135955e-01-1.45308506e+00j
   2.66453526e-15+0.00000000e+00j  1.00000000e+00+0.00000000e+00j]
 [ 2.50000000e+02+0.00000000e+00j -8.47213595e+01-6.15536707e+01j
  -1.11022302e-14+2.39808173e-14j  4.72135955e+00-1.45308506e+01j
   3.55271368e-14+7.10542736e-15j  1.00000000e+01+0.00000000e+00j]]

#skcuda
[[ 2.5000000e+01+0.0000000e+00j -8.4721355e+00-6.1553669e+00j
   8.9406967e-08+5.9604645e-08j  4.7213593e-01-1.4530852e+00j
   0.0000000e+00+0.0000000e+00j  1.0000000e+00+0.0000000e+00j]
 [ 2.5000000e+02+0.0000000e+00j -8.4721359e+01-6.1553673e+01j
   1.4305115e-06-4.7683716e-07j  4.7213597e+00-1.4530851e+01j
   0.0000000e+00+1.9073486e-06j  1.0000000e+01+0.0000000e+00j]]
Krupip
  • @CrisLuengo I've done this with `np.testing.assert_allclose`, which still resulted in a 33% mismatch with `rtol=1e-6`. I agree some of these are within numerical precision, but the 3rd element of each row is *vastly* different between the two arrays, several orders of magnitude apart. – Krupip Jun 28 '19 at 18:40
  • @CrisLuengo I meant to use atol, not rtol; I still ended up with a 16.6% mismatch, which I assume comes from that 3rd column value. – Krupip Jun 28 '19 at 18:44
  • @CrisLuengo CUDA will use whatever you tell it to, but if you look here, I force everything to be np.float32, so yes, everything is using 32-bit floats. – Krupip Jun 28 '19 at 18:49
  • @CrisLuengo Okay, that makes sense, but is there an error function in numpy that will actually recognize that? So far all the approx-equal functions it has still reject it, and I'm not sure under what scenarios, and how, I can say "these things are close enough!" – Krupip Jun 28 '19 at 18:52
  • @CrisLuengo Well, it's using 32-bit float reals and creating complex numbers during the FFT transformation, but complex64 is two 32-bit floats for the real and imaginary parts. – Krupip Jun 28 '19 at 18:54

3 Answers


The output of an FFT has an error that is relative to the magnitude of the input values. Each output element is computed by combining all input elements, so it is their magnitudes that determine the precision of the result.

You are computing two 1D FFTs stored in the same array. They have inputs of different magnitude, and therefore should be compared with different tolerances.

The following quick code demonstrates how you could implement this. I don't know how to tweak any of the functions in numpy.testing to do this.

import numpy as np

array_0 = np.array([[1, 2, 3, 4, 5, 4, 3, 2, 1, 0]], dtype=np.float32)
array_1 = array_0 * 10
time_domain_signal = np.array([array_0[0], array_1[0]], dtype=np.float32)

# numpy result
a=np.array([[ 2.50000000e+01+0.00000000e+00j, -8.47213595e+00-6.15536707e+00j,
  -1.19348975e-15+2.33146835e-15j,  4.72135955e-01-1.45308506e+00j,
   2.66453526e-15+0.00000000e+00j,  1.00000000e+00+0.00000000e+00j],
 [ 2.50000000e+02+0.00000000e+00j, -8.47213595e+01-6.15536707e+01j,
  -1.11022302e-14+2.39808173e-14j,  4.72135955e+00-1.45308506e+01j,
   3.55271368e-14+7.10542736e-15j,  1.00000000e+01+0.00000000e+00j]])

# skcuda result
b=np.array([[ 2.5000000e+01+0.0000000e+00j, -8.4721355e+00-6.1553669e+00j,
   8.9406967e-08+5.9604645e-08j,  4.7213593e-01-1.4530852e+00j,
   0.0000000e+00+0.0000000e+00j,  1.0000000e+00+0.0000000e+00j],
 [ 2.5000000e+02+0.0000000e+00j, -8.4721359e+01-6.1553673e+01j,
   1.4305115e-06-4.7683716e-07j,  4.7213597e+00-1.4530851e+01j,
   0.0000000e+00+1.9073486e-06j,  1.0000000e+01+0.0000000e+00j]])

# Per-row tolerance, relative to the mean absolute input value of each row.
# 1e-6 because we're using single-precision floats.
tol = np.mean(np.abs(time_domain_signal), axis=1) * 1e-6

# Compute the absolute difference and compare it to our per-row tolerances
diff = np.abs(a - b)
if np.any(diff > tol[:, None]):
    print('ERROR!!!')
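
If you would rather keep using numpy.testing, one possible workaround (my own sketch, not something numpy.testing provides directly) is to normalize each row by its input magnitude first, so a single absolute tolerance applies everywhere:

# Sketch: divide each result row by the mean absolute value of its input;
# after that, one absolute tolerance of 1e-6 (single precision) fits both rows.
scale = np.mean(np.abs(time_domain_signal), axis=1, keepdims=True)
np.testing.assert_allclose(a / scale, b / scale, rtol=0, atol=1e-6)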
Cris Luengo

This looks a lot like rounding error: single-precision floats have ~7-8 significant decimal digits of precision (doubles have ~15-16).
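
For reference (my own check, not part of the answer), numpy can report the machine epsilon, i.e. the relative rounding step, for each float type:

import numpy as np

# Relative rounding step at 1.0 for each precision
print(np.finfo(np.float32).eps)  # ~1.19e-07 -> ~7-8 significant digits
print(np.finfo(np.float64).eps)  # ~2.22e-16 -> ~15-16 significant digits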

Instead of numpy.fft, an alternative is to use fftpack from scipy, which supports single-precision floats directly, e.g.:

import numpy as np
from scipy import fftpack

x = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0])

# fftpack keeps float32 input in single precision; numpy.fft would upcast to double
y = fftpack.fft(
    np.array([x, x * 10], dtype=np.float32)
)
print(y[:, :6])

outputting:

[[ 2.5000000e+01+0.0000000e+00j -8.4721355e+00-6.1553669e+00j
   8.9406967e-08+5.9604645e-08j  4.7213593e-01-1.4530852e+00j
   0.0000000e+00+0.0000000e+00j  1.0000000e+00+0.0000000e+00j]
 [ 2.5000000e+02+0.0000000e+00j -8.4721359e+01-6.1553673e+01j
   1.1920929e-06+1.9073486e-06j  4.7213583e+00-1.4530851e+01j
   0.0000000e+00+1.9073486e-06j  1.0000000e+01+0.0000000e+00j]]

which looks much closer to the skcuda output.
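
Applying this to the original test is then a small change (a sketch reusing the names from the question; the tolerance is loosened to 5 decimals since both results now carry single-precision rounding):

# Full complex FFT: keep only the non-redundant half to match the R2C output
expected = fftpack.fft(time_domain_signal)[:, :fft_point_count // 2 + 1]
np.testing.assert_array_almost_equal(expected, fft_reserved.get(), decimal=5)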

Sam Mason
  • I did not realize that numpy's fft uses double precision internally even if you use single precision as input (see the quick check after these comments). – Krupip Jun 29 '19 at 15:21
  • numpy tends to expose the basics while scipy has a lot more; numpy's [linear algebra](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html) is very spartan compared to [scipy's](https://docs.scipy.org/doc/scipy/reference/linalg.html) – Sam Mason Jun 29 '19 at 15:45
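
For reference, a quick way to confirm the upcast mentioned in the first comment above (my own check, not from the thread):

import numpy as np

x32 = np.zeros(8, dtype=np.float32)
# numpy.fft computes in double precision regardless of the input dtype
print(np.fft.rfft(x32).dtype)  # complex128, even for float32 input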

Tiny result values around 1e-8 (float) and 1e-15 (double) from an FFT (given full-scale input or output anywhere near 1.0) are essentially zero plus numerical rounding noise.

And

zero + noise == zero + noise

so your results might actually be the same.
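
A quick sanity check of that noise-floor claim against the data in the question (my own numbers, scaling machine epsilon by the ~25 output magnitude of the first row):

import numpy as np

# Rounding floor for outputs of magnitude ~25 in each precision; the
# "mismatched" third bins (8.9e-08 and -1.2e-15) sit below these floors.
print(25 * np.finfo(np.float32).eps)  # ~2.98e-06
print(25 * np.finfo(np.float64).eps)  # ~5.55e-15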

hotpaw2
  • Pretty sure that should be "one + noise"; those values are all around `np.finfo(np.float32).eps` (or similar for doubles). Single-precision floats are fine a long way past 1e±30, and doubles out past 1e±300. – Sam Mason Jun 29 '19 at 15:39