
This is the output of ginfo using Jacket/MATLAB:

Detected CUDA-capable GPUs:
CUDA driver 270.81, CUDA toolkit 4.0
GPU0 Tesla C1060, 4096 MB, Compute 1.3 (single,double) (in use)
GPU1 Tesla C1060, 4096 MB, Compute 1.3 (single,double)
GPU2 Quadro FX 1800, 742 MB, Compute 1.1 (single)
Display Device: GPU2 Quadro FX 1800

The questions are:

  1. Can I use the two Teslas at the same time (e.g. with parfor)? How?
  2. How can I tell how many cores are currently running/executing the program?
  3. After running the following code with the Quadro as the device in use, I found it takes less time than on the Tesla, despite the Tesla having 240 cores and the Quadro only 64. Is that because the Quadro is the display device, or because it computes in single precision while the Tesla run is in double precision?
clc; clear all; close all;
addpath('C:/Program Files/AccelerEyes/Jacket/engine');
i = im2double(imread('cameraman.tif'));
i_gpu = gdouble(i);
h = fspecial('motion', 50, 45);   % create predefined 2-D filter
h_gpu = gdouble(h);
tic;
for j = 1:500
    x_gpu = imfilter(i_gpu, h_gpu);
end
i2 = double(x_gpu);   % memory transfer
t = toc
figure(2), imshow(i2);

Any help with the code will be appreciated. As you can see, it is a very trivial example used to demonstrate the power of the GPU, nothing more.

pyCuda
  • Reading through the MATLAB documentation, it seems that only one GPU is used for GPU-accelerated calculations (someone please correct me if I am wrong), and MATLAB provides functions for selecting which GPU to use when multiple GPUs are installed on a single machine. Does this question http://stackoverflow.com/questions/5473543/parallelizing-a-for-loop-to-run-simultaneously-on-multiple-gpu-cores help you at all? – Chris Dec 05 '11 at 09:42
  • As far as I can tell, Jacket is the only software that supports multiple GPUs for MATLAB. You can find more information here: http://wiki.accelereyes.com/wiki/index.php/Jacket_MGL *Disclaimer* I work at AccelerEyes and am involved with the development of Jacket. – Pavan Yalamanchili Dec 05 '11 at 22:26
  • I see you're using Jacket from AccelerEyes. I hope this link about Jacket MGL will help: http://www.accelereyes.com/products/jacket_multi_gpu – jwdmsd Dec 07 '11 at 08:23
  • Can I get, by profiling my code, the kernel/thread dimensions Jacket used? – pyCuda Apr 12 '12 at 08:54

2 Answers


Using two Teslas at the same time: write a MEX file and call cudaSetDevice(0), launch one kernel, then call cudaSetDevice(1) and launch another kernel. Kernel launches and the asynchronous memory copies (i.e., cudaMemcpyAsync and cudaMemcpyPeerAsync) return control to the host immediately, so the two devices can work concurrently. I've given an example of how to write a MEX file (i.e., a DLL) in one of my other answers; just add a second kernel to that example. FYI, you don't need Jacket if you can do some C/C++ programming. On the other hand, if you don't want to spend your time learning the CUDA SDK, or you don't have a C/C++ compiler, then you're stuck with Jacket, gp-you, or GPUlib until MATLAB changes the way that parfor works.
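To make that concrete, here is a minimal sketch of such a MEX file (a hypothetical example, not the code from the answer linked above; the file name two_gpu_scale.cu, the kernel, and the half/half split are my own choices). It processes the first half of a vector on GPU 0 and the second half on GPU 1, selecting each device with cudaSetDevice:

// two_gpu_scale.cu -- hypothetical sketch of the two-device MEX pattern
#include "mex.h"
#include <cuda_runtime.h>

// Trivial kernel: multiply every element by 2.
__global__ void scaleByTwo(double *d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0;
}

// Gateway: y = two_gpu_scale(x), where x is a real double vector.
// The first half of x is processed on GPU 0, the second half on GPU 1.
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    int n = (int)mxGetNumberOfElements(prhs[0]);
    plhs[0] = mxDuplicateArray(prhs[0]);          // output starts as a copy of the input
    double *y = mxGetPr(plhs[0]);

    int counts[2]  = { n / 2, n - n / 2 };
    int offsets[2] = { 0, n / 2 };
    double *dev[2] = { NULL, NULL };

    // Issue work on both devices. Kernel launches are asynchronous; for the
    // copies to overlap as well, the host buffer would need to be pinned
    // (e.g. with cudaHostRegister).
    for (int g = 0; g < 2; ++g) {
        cudaSetDevice(g);                         // select Tesla 0, then Tesla 1
        size_t bytes = counts[g] * sizeof(double);
        cudaMalloc(&dev[g], bytes);
        cudaMemcpyAsync(dev[g], y + offsets[g], bytes, cudaMemcpyHostToDevice);
        scaleByTwo<<<(counts[g] + 255) / 256, 256>>>(dev[g], counts[g]);
        cudaMemcpyAsync(y + offsets[g], dev[g], bytes, cudaMemcpyDeviceToHost);
    }

    // Wait for both GPUs to finish, then release the device buffers.
    for (int g = 0; g < 2; ++g) {
        cudaSetDevice(g);
        cudaDeviceSynchronize();
        cudaFree(dev[g]);
    }
}

You would compile the .cu file into a MEX function (e.g. with nvcc plus the mex command; the exact build steps depend on your toolchain) and call it from MATLAB as y = two_gpu_scale(x); with a double vector x.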

An alternative is to call OpenCL from Matlab (again through a MEX file). Then you could launch kernels on all the GPUs and CPUs. Again, this requires some C/C++ programming.

user244795

Since MATLAB 2012, gpuArray and GPU-related functions are fully integrated into MATLAB (through the Parallel Computing Toolbox), so you might not need Jacket to achieve what you are trying to do.

In short, call gpuDevice(deviceID); before running GPU commands, and the code that follows will run on the deviceID-th GPU. For instance:

gpuDevice(1);
a = gpuArray(rand(3)); % a lives in the first GPU's memory
gpuDevice(2);
b = gpuArray(rand(4)); % b lives in the second GPU's memory

To run on multiple GPUs, simply put:

num_device = gpuDeviceCount;   % number of GPUs visible to MATLAB
c = cell(1, num_device);
parfor i = 1:num_device        % needs an open matlabpool/parpool to run in parallel
    gpuDevice(i);              % each worker selects its own GPU
    a = gpuArray(magic(3));
    b = gpuArray(rand(3));
    c{i} = gather(a*b);        % bring the product back to host memory
end

You can see the GPU memory usage by typing nvidia-smi on the system command line.

This way of selecting a GPU may seem strange, but it is the conventional way to set the GPU id. In CUDA, if you want to use a specific GPU, you call cudaSetDevice(gpuId) and the code that follows runs on the gpuId-th GPU (0-based indexing, whereas MATLAB's gpuDevice uses 1-based indices).
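For reference, here is a tiny standalone CUDA sketch of that convention (illustrative only, not tied to the MATLAB code above): it walks over all devices by index, selects each with cudaSetDevice, and makes a small allocation that lands on that device.

#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);      // CUDA devices are numbered 0 .. count-1
    for (int id = 0; id < count; ++id) {
        cudaSetDevice(id);           // everything issued after this targets device `id`
        void *p = NULL;
        cudaMalloc(&p, 1 << 20);     // this 1 MB allocation lands on device `id`
        cudaFree(p);
        printf("allocated and freed 1 MB on device %d\n", id);
    }
    return 0;
}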

----------------------EDIT----------------

Tested and confirmed on MATLAB 2012b, MATLAB 2013b.

I checked with nvidia-smi that the code actually uses memory on the different GPUs. You might have to scale the arrays up to something very large, e.g. rand(5000), and check quickly, since the temporary variables a and b disappear once the parfor loop ends.

VforVitamin
  • What version of MATLAB does this require, btw? – Ben Voigt Oct 14 '14 at 22:59
  • Tested and confirmed on MATLAB 2012b, MATLAB 2013b. – VforVitamin Oct 15 '14 at 04:02
  • It's worth pointing out that you're talking about the [PCT Toolbox](http://www.mathworks.com/products/parallel-computing/) by MathWorks, while the original question was asking about the now [discontinued](http://blog.accelereyes.com/blog/2012/12/12/exciting-updates-from-accelereyes/) [Jacket](https://en.wikipedia.org/wiki/Jacket_%28software%29) package by AccelerEyes. – Amro Oct 15 '14 at 05:50
  • Ooops, I didn't read carefully, I should move the answer to a different thread – VforVitamin Oct 15 '14 at 06:14
  • @ChristopherB.Choy: I think it's fine here, just wanted to clear any confusion – Amro Oct 15 '14 at 08:36
  • Okay, since the title is so similar, I'll leave it here. But I'll modify it slightly so that it is relevant to Jacket. – VforVitamin Oct 15 '14 at 18:55