
I acquired an Nvidia Jetson TK1 a few weeks ago and I'm trying to use the CPU and GPU at the same time, hence the use of the Stream class. A simple test showed that it does not behave the way I expected. I'm probably using it wrong, or maybe I'm missing a compiler option.

I checked this question for answers before posting mine: how to use gpu::Stream in OpenCV?

Here is my code:

#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/gpu/gpu.hpp"
#include <time.h>

using namespace cv;
using namespace std;
using namespace gpu;

int main(int argc, char** argv)
{
    unsigned long AAtime = 0, BBtime = 0;
    gpu::setDevice(0);
    gpu::FeatureSet(FEATURE_SET_COMPUTE_30);
    Mat host_src = imread(argv[1], 0);
    GpuMat gpu_src, gpu_dst;

    Stream stream;

    gpu_src.upload(host_src);

    AAtime = getTickCount();
    blur(gpu_src, gpu_dst, Size(5,5), Point(-1,-1), stream);

    //Cpu function
    int k = 0;
    for (unsigned long long int j = 0; j < 10; j++)
        for (unsigned long long int i = 0; i < 10000000; i++)
            k += rand();

    stream.waitForCompletion();
    Mat host_dst;
    BBtime = getTickCount();
    cout << (BBtime - AAtime) / getTickFrequency() << endl;
    gpu_dst.download(host_dst);

    return 0;
}

Using the timer, I saw that the overall time is the CPU time plus the GPU time, not the longer of the two, so they are not running in parallel. I tried using CudaMem as jet47 showed, but when I display the result I only see stripes instead of my image:

CudaMem host_src_pl(Size(900, 1200), CV_8UC1, CudaMem::ALLOC_PAGE_LOCKED); // My image is 1200 by 900
CudaMem host_dst_pl;
Mat host_src= imread(argv[1],0);
host_src = host_src_pl;
//rest of the code

To compile I used this command: "g++ -Ofast -mfpu=neon -funsafe-math-optimizations -fabi-version=8 -Wabi -std=c++11 -march=armv7-a testStream.cpp -fopenmp -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_calib3d -lopencv_contrib -lopencv_features2d -lopencv_flann -lopencv_gpu -lopencv_legacy -lopencv_ml -lopencv_objdetect -lopencv_photo -lopencv_stitching -lopencv_superres -lopencv_video -lopencv_videostab -o gpuStream". Some of these options might be redundant; I tried without them and the result is the same.

What am I missing? Thanks for your answers :)

Raziel
  • If by chance you're still working on this, did you ever use nvprof to profile the program? It would show when the GPU kernels are running, which stream they're running on, etc. – Kelsius Jul 26 '17 at 12:20

0 Answers