Scikit cuda FFT large data

Question

I recently downloaded the newest scikit-cuda to work with FFTs, but I have run into a problem. My data size and window size are both 2^19, so the array going into the FFT function has 524288 elements, which is far below the 2^27-element limit listed in the documentation.

    multiply_them = ElementwiseKernel(
        "float *dest, float *a, float *b",
        "dest[i] = a[i] * b[i]",
        "linear_combination")

    gval1 = gpuarray.to_gpu(val1.astype(numpy.float32))      # gval1 = input * rescale * gain
    gwindow = gpuarray.to_gpu(window.astype(numpy.float32))  # gwindow = filtering window
    gval2 = gpuarray.to_gpu(numpy.zeros_like(gval1.get()))   # set up zero array

    multiply_them(gval2, gval1, gwindow)  # gval2 = gval1 .* gwindow

    val1 = gval2.get()  # retrieve the windowed data from the GPU

    # gval1 = fft(gval1, fft_window_size);
    # gval1 = fftshift(gval1, 1);
    # gval1 = abs(gval1);
    gval1 = gpuarray.to_gpu(val1)
    gval2 = gpuarray.to_gpu(numpy.empty(fft_window_size, numpy.complex64))
    plan_forward = cu_fft.Plan(gval1.shape[0]*2, numpy.float32, numpy.complex64)
    cu_fft.fft(gval1, gval2, plan_forward)
    # val2 = scipy.fftpack.fft(val1, fft_window_size)
    val1 = gval2.get()

Yet when I run the code and check the result against MATLAB's and SciPy's FFT functions, the values trail off to zero halfway through the output. I can't figure out how to increase the batch size and still get correct numbers. Any advice would be appreciated.

Real-to-complex transforms are symmetrical. cuFFT exploits this and only computes half of the solution. Are you sure this isn't what you are seeing? — talonmies, Aug 19 '16 at 13:02
@talonmies Yes, I can see that happening, but I am a little concerned because the values I have at the halfway point don't match MATLAB and SciPy. However, the sums of the arrays up to that point match. Should I worry about the difference in values that I see? — SanticL, Aug 19 '16 at 13:54
The cuFFT solutions require normalising. That is probably the difference. — talonmies, Aug 19 '16 at 16:22
@talonmies How do I do that? Is there already an answer here for that? — SanticL, Aug 19 '16 at 19:49
Divide the transform by the number of samples in the input data. But this is just a guess. You haven't really shown any useful comparison for any of the problems you have been asking about here, so it is very hard to provide you with a concrete answer to this question. — talonmies, Aug 20 '16 at 19:52
@talonmies I looked at plots from both MATLAB and PyCUDA and they appear to be nearly identical (with some more precision on PyCUDA). Thank you very much for answering my questions though! — SanticL, Aug 22 '16 at 12:41
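
The half-length output discussed in the comments can be reproduced without a GPU. The sketch below is plain Python, not scikit-cuda: `naive_dft` and `full_spectrum_from_half` are hypothetical helper names, and the slow O(N^2) DFT only stands in for an unnormalized forward FFT. It assumes the real-to-complex transform fills just the first N/2 + 1 bins, as the cuFFT documentation describes for R2C transforms, and rebuilds the remaining bins from the conjugate symmetry X[k] = conj(X[N - k]).

```python
import cmath


def naive_dft(x):
    """Unnormalized forward DFT, O(N^2). For illustration only."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def full_spectrum_from_half(half, n):
    """Rebuild all n bins from the first n//2 + 1 bins of a real-input DFT.

    Hypothetical helper: relies on X[k] == conj(X[n - k]), which holds
    only when the transformed signal is real-valued.
    """
    if len(half) != n // 2 + 1:
        raise ValueError("expected n//2 + 1 bins for a real-input transform")
    full = list(half)
    for k in range(n // 2 + 1, n):
        full.append(half[n - k].conjugate())
    return full


# Small made-up real signal; the values are arbitrary.
signal = [0.0, 1.0, 0.5, -0.25, 2.0, -1.0, 0.75, 0.1]
n = len(signal)

reference = naive_dft(signal)      # full-length spectrum, like MATLAB/SciPy fft
half = reference[: n // 2 + 1]     # the part a real-to-complex transform stores
rebuilt = full_spectrum_from_half(half, n)

max_error = max(abs(a - b) for a, b in zip(rebuilt, reference))
print(len(half), len(rebuilt), max_error < 1e-9)
```

If the same assumption holds for the array returned by `gval2.get()`, passing its first N/2 + 1 entries to `full_spectrum_from_half` should give a full-length spectrum that can be compared bin by bin with MATLAB or SciPy. On scaling, cuFFT documents its transforms as unnormalized, so a forward transform followed by an inverse returns the input multiplied by N; dividing by N is what undoes that round-trip factor.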
