I have a question about the ArrayFire library and its memory usage. I implemented a program in plain CUDA/C and the same program using ArrayFire, and the CUDA/C version is much faster (about 5 times faster than the ArrayFire one).
I ran both through the NVIDIA profiler, and the main difference I see is in the memcpy operations: the ArrayFire version issues a lot of them, while the CUDA/C version issues only a few at the beginning of the program. After some tests I found that doing something like:
f = f*q;
where f and q are arrays, generates more of these memcpy calls. I think this is why my ArrayFire code does not perform better. Why does this happen? Where do all these memcpys come from, and how can I avoid them?

Edit: a fragment of the code:
    void Adveccion() {
        for (int i = 0; i < q; i++) {
            f(span, span, span, i) = shift(f(span, span, span, i), V[1][i], V[0][i], V[2][i]);
        }
    }
f is a four-dimensional array, and this function is called inside another for loop. If I modify the function like this:
    void Adveccion() {
        for (int i = 0; i < q; i++) {
            shift(f(span, span, span, i), V[1][i], V[0][i], V[2][i]);
        }
    }
the profiler no longer shows the massive number of memcpys. I think my problem is finding the correct way to assign new values to the arrays. Maybe using A = B is not the best way, but I still have a lot to learn.
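To make the difference between the two variants concrete, here is a small CPU-only model in standard C++. It is a sketch, not ArrayFire code, and it makes no claim about how ArrayFire implements indexed assignment. It assumes a column-major layout (so each slice along the fourth dimension is contiguous) and a circular shift where a positive offset moves elements toward higher indices. The names `Field4D`, `shiftSlice`, and `advect` are hypothetical. The only thing it shows is that writing the shifted result back into the slice adds one full slice copy per iteration, which the variant that discards the result never pays.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a 4-D array, stored column-major
// (dimension 0 varies fastest), so slice i along dimension 3 is contiguous.
struct Field4D {
    std::size_t nx, ny, nz, nq;
    std::vector<float> data;

    Field4D(std::size_t nx_, std::size_t ny_, std::size_t nz_, std::size_t nq_)
        : nx(nx_), ny(ny_), nz(nz_), nq(nq_), data(nx_ * ny_ * nz_ * nq_, 0.0f) {}

    std::size_t sliceSize() const { return nx * ny * nz; }
    float* slice(std::size_t i) { return data.data() + i * sliceSize(); }
};

// Wrap a possibly negative shift into the range [0, n).
static std::size_t wrap(long s, std::size_t n) {
    const long m = static_cast<long>(n);
    return static_cast<std::size_t>(((s % m) + m) % m);
}

// Circular shift of one 3-D slice into a freshly allocated buffer
// (assumed semantics): out[(x+sx)%nx, (y+sy)%ny, (z+sz)%nz] = in[x, y, z].
std::vector<float> shiftSlice(const float* in, std::size_t nx, std::size_t ny,
                              std::size_t nz, long sx, long sy, long sz) {
    std::vector<float> out(nx * ny * nz);
    const std::size_t ox = wrap(sx, nx), oy = wrap(sy, ny), oz = wrap(sz, nz);
    for (std::size_t z = 0; z < nz; ++z)
        for (std::size_t y = 0; y < ny; ++y)
            for (std::size_t x = 0; x < nx; ++x) {
                const std::size_t tx = (x + ox) % nx;
                const std::size_t ty = (y + oy) % ny;
                const std::size_t tz = (z + oz) % nz;
                out[tx + nx * (ty + ny * tz)] = in[x + nx * (y + ny * z)];
            }
    return out;
}

// Models the loop from the question. With writeBack == true the shifted
// slice is copied back into f (first variant); with false the result is
// discarded (second variant). Returns how many elements were copied back.
std::size_t advect(Field4D& f, const std::vector<std::array<long, 3>>& v,
                   bool writeBack) {
    std::size_t copied = 0;
    for (std::size_t i = 0; i < f.nq; ++i) {
        std::vector<float> tmp = shiftSlice(f.slice(i), f.nx, f.ny, f.nz,
                                            v[i][0], v[i][1], v[i][2]);
        if (writeBack) {
            std::copy(tmp.begin(), tmp.end(), f.slice(i));  // the extra copy
            copied += tmp.size();
        }
    }
    return copied;
}
```

Calling `advect(f, v, true)` on a field with `nq` slices reports `nq * f.sliceSize()` copied elements, while `advect(f, v, false)` reports zero and leaves `f` untouched. Whether the memcpys in the profiler correspond to this write-back step in ArrayFire is an assumption that would need to be confirmed.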
Thanks for your attention. If you need more code to help me, just let me know. Thanks!