MKL or BLAS routine to multiply vector by a scalar out-of-place

Question

I work in simulation software and one of the many operations done on arrays is scaling a vector by a number.

I have code like this:

//Just some initialization code, don't bother about this part
int n = 10000;
std::vector<double> input(n, 42.0);
std::vector<double> output(input.size());

double alpha = 69.0;

//the actual calculation:
for (size_t i = 0; i < n; ++i) {
    output[i] = input[i] * alpha;
}

I have the MKL library available, so if my calculations are done "in-place" the following can be written:

cblas_dscal(n, alpha, &input[0], 1);

However, this will change the input variable, which is not what I want.

I tried using the mkl_domatcopy() but it is very slow for this operation.

`cblas_dcopy` then `cblas_dscal` or zero and `cblas_daxpy` ? Would that be slower ? — francis, Jan 13 '14 at 21:33
That's the solution I came up with francis. It is faster than the simple loop by a small amount, however it looks like a waste of time. But it will do for now. — Gustavo Muenz, Jan 14 '14 at 11:43

score 1 · Accepted Answer · answered Jan 14 '14 at 11:44

1

The solution I came up with was calling cblas_dcopy() then cblas_dscal().

It is not the best of all worlds but it is still faster than the raw loop.

answered Jan 14 '14 at 11:44

Gustavo Muenz

9,278
7
40
42

3

Did you try cblas_daxpy with a 0 array? – Vedaad Shakib Jul 19 '16 at 23:27
I didn't, but I will try that – Gustavo Muenz Oct 03 '16 at 14:09

score 1 · Answer 2 · answered Feb 05 '17 at 07:32

1

From MKL reference:

The cblas_?axpy routines perform a vector-vector operation defined as

y := a * x + y where: a is a scalar, x and y are vectors each with a number of elements that equals n.

answered Feb 05 '17 at 07:32

ZFY

135
12

score 1 · Answer 3 · answered Nov 10 '17 at 08:57

There is a routine within the BLAS-like extensions called cblas_?axpby

Here is an excerpt from the documentation:

Scales two vectors, adds them to one another and stores result in the vector.

The difference to cblas_?axpy is that you have second scaling parameter on the result vector y. In your case, you could just set b := 0.0 and thus have your out of place scaling in one call, instead of the two cblas_dcopy() and cblas_dscal().

Your code could look like this:

//Just some initialization code, don't bother about this part
int n = 10000;
std::vector<double> input(n, 42.0);
std::vector<double> output(input.size());

double alpha = 69.0;

//the actual calculation:
cblas_daxpby (output.size(), alpha, input.data(), 1, 0.0, output.data(), 1);

That looks like the perfect solution. Unfortunately I don't work at this company anymore and can't really be sure that this will be done. — Gustavo Muenz, Nov 16 '17 at 19:07
I didn't know `std::vector::data()` member function to retrieve a pointer to the internal raw array, which is generally faster than an iterator. Thank you. — ynn, Apr 02 '20 at 14:08

MKL or BLAS routine to multiply vector by a scalar out-of-place

3 Answers3