I have a vector, and I would like to do the following, using CUDA and Thrust transformations:
// thrust::device_vector v;
// for k times:
// calculate constants a and b as functions of k;
// for (i=0; i < v.size(); i++)
// v[i] = a*v[i] + b*v[i+1];
How should I correctly implement this? One way I can do it is to have vector w, and apply thrust::transform onto v and save the results to w. But k is unknown ahead of time, and I don't want to create w1, w2, ... and waste a lot of GPU memory space. Preferably I want to minimize the amount of data copying. But I'm not sure how to implement this using one vector without the values stepping on each other. Is there something Thrust provides that can do this?