_axpy is a BLAS level 1 operation that implements the following:
for i = 1:n
a[i] = a[i]-$\alpha$ b[i]
Efficient implementations of the regular daxpy are available through various BLAS libraries such as MKL.
In my case, I want to implement the following variant of the daxpy operation, which uses indirect addressing:
for i = 1:n
a[ind1[i]] = a[ind1[i]]-$\alpha$ b[i]
where ind1 contains the indices of the elements of vector a that need to be updated. What I know is that ind1 is strictly increasing, i.e. $ind1[i] > ind1[j] \forall i > j$.
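To make the target operation concrete, here is a minimal scalar C sketch of the indexed loop. The function name `scatter_axpy`, the 0-based indexing, and the use of `int` indices are assumptions made for illustration; they are not taken from any library. Because ind1 is strictly increasing, no index repeats, so the iterations are independent of one another.

```c
#include <stddef.h>

/* Hypothetical reference routine (name and signature are illustrative):
 * computes a[ind1[i]] = a[ind1[i]] - alpha * b[i] for i = 0 .. n-1.
 * Assumes 0-based indices, that ind1 is strictly increasing (so no index
 * repeats), and that a, b and ind1 do not overlap in memory. */
void scatter_axpy(size_t n, double alpha,
                  const double *restrict b,
                  const int *restrict ind1,
                  double *restrict a)
{
    for (size_t i = 0; i < n; ++i) {
        a[ind1[i]] -= alpha * b[i];
    }
}
```

This is only the baseline that a vectorized version would need to beat; the loads from b are contiguous, while the reads and writes to a are scattered.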
I assume such a computation arises very often in sparse linear algebra. Does anyone know of an efficient SSE/AVX-based implementation of such a routine?