I received a computer with four Tesla K80 GPUs, and I am trying parfor loops from the MATLAB Parallel Computing Toolbox (PCT) to speed up FFT calculations, yet the result is slower.

Here is what I am trying:

% pupil is a 512x512 array

    parfor zz = 1:4
        gd = gpuDevice;
        d{zz} = gd.Index;
        probe{zz} = gpuArray(pupil); 
        Essai{zz} = gpuArray(pupil); 
    end

    tic;
    parfor ii = 1:4
        gd2 = gpuDevice;
        d2{ii} = gd2.Index;
        for i = 1:100
            [Essai{ii}] = fftn(probe{ii});
        end
    end
    toc
    %%

Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
Elapsed time is 1.805763 seconds.
Elapsed time is 1.412928 seconds.
Elapsed time is 1.409559 seconds.

Starting parallel pool (parpool) using the 'local' profile ... connected to 8 workers.
Elapsed time is 0.606602 seconds.
Elapsed time is 0.297850 seconds.
Elapsed time is 0.294365 seconds.
%%
tic; for i = 1:400; Essai{1} = fftn( probe{1} ); end; toc
Elapsed time is 0.193579 seconds !!!

Why is the pool with 8 workers faster, when in principle I only stored my variables on 4 of the 8 GPUs?

Also, how can I use a Tesla K80 as a single GPU?

Thanks, Nicolas

  • The K80 is a multi-chip GPU board. Each K80 provides two GK210 chips (each with 12 GB of GDDR5), connected through a PCIe switch. From the user's perspective (CUDA, etc.), each K80 board contains two GPUs, so it's possible that your 8 workers correspond to 4 boards x 2 = 8 available GPUs. – Hopobcn Sep 08 '16 at 08:45
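A quick way to check this from MATLAB is to list the devices it can see. This is only a sketch: it assumes the Parallel Computing Toolbox is installed, and it uses `parallel.gpu.GPUDevice.getDevice` so that no device is selected or reset while querying.

```matlab
% List every CUDA device MATLAB can see, without selecting or resetting any.
nGPU = gpuDeviceCount;
fprintf('MATLAB sees %d GPU device(s)\n', nGPU);
for k = 1:nGPU
    g = parallel.gpu.GPUDevice.getDevice(k);   % query only, does not select
    fprintf('%d: %s, %.1f GB\n', g.Index, g.Name, g.TotalMemory / 2^30);
end
```

If the comment above is right, a machine with four K80 boards should report eight devices here rather than four.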

2 Answers

1

I doubt that parfor works for multi-GPU systems. If speed is critical and you want to take full advantage of your GPUs, I suggest writing your own small CUDA program using the cuFFT library: http://docs.nvidia.com/cuda/cufft/#multiple-GPU-cufft-transforms

Here is how to write a MEX file containing CUDA code: http://www.mathworks.com/help/distcomp/run-mex-functions-containing-cuda-code.html

0

Many thanks for your quick reply and for the links! It is true that I was trying to avoid CUDA, but it seems like the best option for spreading FFTs. I had thought, though, that parfor and spmd were great tools for multiple GPUs.
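For anyone who wants to try the spmd route before moving to CUDA, here is a minimal sketch of the usual pattern: one worker per visible GPU, data kept on the worker between blocks, and an explicit synchronisation before `toc` (GPU calls can return before the work has finished, so timings without `wait` may be misleading). The random `pupil` is a stand-in for the 512x512 array from the question, and `nGPU`, `probe` and `essai` are just illustrative names. In newer MATLAB releases `labindex` is called `spmdIndex`. This has not been benchmarked against the numbers above.

```matlab
% Assumption: stand-in for the 512x512 pupil array from the question.
pupil = rand(512, 512, 'single');

nGPU = gpuDeviceCount;              % number of devices MATLAB can see
pool = gcp('nocreate');
if isempty(pool) || pool.NumWorkers ~= nGPU
    delete(gcp('nocreate'));        % close any pool of the wrong size
    pool = parpool(nGPU);           % one worker per GPU
end

spmd
    gpuDevice(labindex);            % bind this worker to its own GPU
    probe = gpuArray(pupil);        % stays on the worker between spmd blocks
end

tic;
spmd
    for k = 1:100
        essai = fftn(probe);        % runs on this worker's GPU
    end
    wait(gpuDevice);                % block until queued GPU work has finished
end
toc
```

Keeping `probe` inside spmd avoids returning gpuArrays to the client between loops, which the cell arrays in the original parfor version do. The measured time still includes the overhead of launching the spmd block itself.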