Caffe Batch processing no speedup

Question

I would like to speed up the forward pass of a CNN used for classification with Caffe.

I have tried batch classification in Caffe using the code provided here: Modifying the Caffe C++ prediction code for multiple inputs. This solution lets me pass a vector of Mat, but it does not speed anything up, even though the input layer is modified.

I am processing fairly small images (3x64x64) on a powerful PC with two GTX 1080 cards, so memory is not an issue. I also tried changing deploy.prototxt, but I get the same result.
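For reference, here is a minimal sketch of what packing a batch for a single forward pass involves. It relies only on the fact that a 4-D Caffe blob is stored contiguously in N x C x H x W order, while OpenCV-style images are interleaved (H x W x C). `PackBatchNCHW` is a hypothetical helper written for illustration, not part of Caffe or of the linked code, and the conversion to float without mean subtraction or scaling is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Caffe function): packs a batch of interleaved
// H x W x C 8-bit images into one contiguous float buffer laid out as
// N x C x H x W, the memory order of a 4-D Caffe blob.
// Assumption: no mean subtraction or scaling is applied here.
std::vector<float> PackBatchNCHW(
    const std::vector<std::vector<std::uint8_t>>& images,
    std::size_t channels, std::size_t height, std::size_t width) {
  const std::size_t plane = height * width;        // elements per channel
  const std::size_t image_size = channels * plane; // elements per image
  std::vector<float> batch(images.size() * image_size);

  for (std::size_t n = 0; n < images.size(); ++n) {
    // Every image must have exactly channels * height * width elements.
    assert(images[n].size() == image_size);
    for (std::size_t y = 0; y < height; ++y) {
      for (std::size_t x = 0; x < width; ++x) {
        for (std::size_t c = 0; c < channels; ++c) {
          // Source index: interleaved (y, x, c). Destination: planar (n, c, y, x).
          const std::size_t src = (y * width + x) * channels + c;
          const std::size_t dst = n * image_size + c * plane + y * width + x;
          batch[dst] = static_cast<float>(images[n][src]);
        }
      }
    }
  }
  return batch;
}
```

With Caffe, such a buffer would typically be copied into the input blob after reshaping it to the batch size (for example `Reshape(N, 3, 64, 64)` on `net->input_blobs()[0]`, then `net->Reshape()`, then a copy into `mutable_cpu_data()`), followed by a single `net->Forward()` call for the whole batch. Whether this alone gives a speedup in the setup described here is not established.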

It seems that at some point the forward pass of the CNN becomes sequential. Someone else has pointed this out here as well: Batch processing mode in Caffe - no performance gains

Another similar thread, for Python: batch size does not work for caffe with deploy.prototxt

I have seen some things about MemoryDataLayer, but I am not sure this will solve my problem.

So I am kind of lost on what to do exactly. Does anyone have any information on how to speed up classification time? Thanks for any help!

Hello, yes, I checked this as best I could. When using Caffe::set_mode(Caffe::CPU) the processing time is much longer, and there is also more activity on the CPU. When using Caffe::set_mode(Caffe::GPU) I can see some activity on the GPU and the processing time decreases by a large margin. It still seems underwhelming, though: I am using at most 20% of my GPU, with roughly 650 MB of RAM used out of 8 GB. So I am wondering whether I should use some CUDA trick to preload the images in memory. - Luc Mioulet, Feb 01 '17 at 08:03

0 Answers