I am writing a classifier in C++ with Caffe on Windows. Caffe uses the GPU to perform this task, taking the images from file or from memory. In all the samples I have reviewed, Caffe uploads the images to the GPU internally, but in my application the images are already on the GPU, because I need some preprocessing that is done on the GPU with CUDA.
My question is: is it possible to feed the classifier from a CUDA kernel or from the .cu file, taking the images directly from CUDA memory? Or do I have to copy the preprocessed image back to the CPU so that Caffe uploads it to the GPU again to classify it? I feel there should be a way to avoid the double copy to the GPU, but I can't find it.
It seems that NVIDIA TensorRT would handle this situation, but it is not available for Windows yet.