How to compute the sum of the values of elements in a vector using cblas functions?

Question

I need to sum all the elements of a matrix in Caffe.

However, the Caffe wrapper around the cblas functions ('math_functions.hpp' and 'math_functions.cpp') exposes cblas_sasum as caffe_cpu_asum, which computes the sum of the absolute values of the elements of a vector.

Since I'm new to cblas, I looked for a function that does the same without taking absolute values, but there seems to be no such function in cblas.

Any suggestions?

Shai · Accepted Answer · 2016-08-02T16:23:32.130

There is a way to do this using cblas functions, though it is a bit awkward.

What you need to do is define an all-ones vector and take the dot product of that vector with your matrix (treated as a flat array); the result is the sum.

Let myBlob be a caffe Blob whose elements you want to sum:

vector<Dtype> mult_data( myBlob.count(), Dtype(1) );
Dtype sum = caffe_cpu_dot( myBlob.count(), &mult_data[0], myBlob.cpu_data() );

This trick is used in the implementation of the "Reduction" layer.
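To see why the all-ones trick works, here is a minimal sketch in plain standard C++ with no BLAS or Caffe dependency. The helper names dot, sum_via_dot and abs_sum are hypothetical; dot uses std::inner_product as a stand-in for a BLAS dot call such as cblas_sdot or caffe_cpu_dot, and abs_sum mimics what cblas_sasum returns.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a BLAS dot call (cblas_sdot / caffe_cpu_dot).
float dot(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  return std::inner_product(a.begin(), a.end(), b.begin(), 0.0f);
}

// Signed sum: dot product of x with an all-ones vector of the same length.
float sum_via_dot(const std::vector<float>& x) {
  const std::vector<float> ones(x.size(), 1.0f);
  return dot(x, ones);
}

// Sum of absolute values, i.e. what cblas_sasum / caffe_cpu_asum computes.
float abs_sum(const std::vector<float>& x) {
  float s = 0.0f;
  for (float v : x) s += std::fabs(v);
  return s;
}
```

For x = {1.5, -2.0, 0.25}, sum_via_dot(x) gives -0.25 while abs_sum(x) gives 3.75, which is exactly the difference the question is about.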

To make this answer GPU compliant as well, one needs to allocate a Blob for mult_data rather than a std::vector (because you need its gpu_data()):

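vector<int> sum_mult_shape(1, myBlob.count());
Blob<Dtype> sum_multiplier_(sum_mult_shape);
// A freshly allocated Blob holds zeros, so fill it with ones first.
caffe_set( myBlob.count(), Dtype(1), sum_multiplier_.mutable_cpu_data() );
const Dtype* mult_data = sum_multiplier_.cpu_data();
Dtype sum = caffe_cpu_dot( myBlob.count(), mult_data, myBlob.cpu_data() );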

For GPU (in a '.cu' source file):

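vector<int> sum_mult_shape(1, myBlob.count());
Blob<Dtype> sum_multiplier_(sum_mult_shape);
// Fill with ones on the CPU side; gpu_data() syncs the values to the device.
caffe_set( myBlob.count(), Dtype(1), sum_multiplier_.mutable_cpu_data() );
const Dtype* mult_data = sum_multiplier_.gpu_data();
Dtype sum;
caffe_gpu_dot( myBlob.count(), mult_data, myBlob.gpu_data(), &sum );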

kangshiyin · Answer 2 · 2016-08-01T19:47:37.797


Summation of all the elements of an array is simple enough to be implemented by a single for-loop. You only need to use proper compile options to vectorise it with SIMD instructions.

For a Blob in Caffe, you can use .cpu_data() to get the raw pointer to the array and then loop over it with a for-loop.
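A minimal sketch of such a loop, written against a raw pointer so that it does not depend on Caffe. The function name sum_array is hypothetical; in Caffe the arguments would be the pointer returned by cpu_data() and the element count returned by count().

```cpp
#include <cassert>
#include <cstddef>

// Sums n contiguous floats starting at data.
// Assumption: in Caffe, data would come from Blob::cpu_data() and
// n from Blob::count().
float sum_array(const float* data, std::size_t n) {
  assert(data != nullptr || n == 0);
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum += data[i];
  }
  return sum;
}
```

Whether the loop is actually vectorised depends on the compiler and its flags. With GCC or Clang, optimisation such as -O3 is the usual starting point, and floating-point reductions typically also require permission to reassociate additions (for example -ffast-math), which can change the result slightly.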


Thanks for your reply. Lots of cblas functions also perform very simple operations, but do so efficiently, and I have to do this in Caffe without a plain for-loop to avoid poor performance. By the way, how can I loop over all the values in the data_ variable of a Blob in Caffe? – Ali Sharifi B. Aug 01 '16 at 19:12
A for-loop performs well for this. You can use proper compile options to vectorize it. – kangshiyin Aug 01 '16 at 19:30
