Is there an element-wise multiplication in cuBLAS? I am trying to perform these MATLAB operations:
x .* s
x ./ s
I have a host implementation using a for loop and another one in CUDA, but I wonder whether I missed a cuBLAS library function that can do this in an optimized way.
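For reference, the host-side loops could look roughly like the sketch below. It only pins down what `x .* s` and `x ./ s` are expected to compute. It assumes equal-length vectors of doubles, and the names `hostMul` and `hostDiv` are hypothetical, chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host reference for MATLAB's x .* s (element-wise product).
std::vector<double> hostMul(const std::vector<double>& x,
                            const std::vector<double>& s) {
    assert(x.size() == s.size());  // assumption: both vectors have the same length
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = x[i] * s[i];
    }
    return out;
}

// Hypothetical host reference for MATLAB's x ./ s (element-wise quotient).
// A zero in s is not handled specially; the result follows IEEE-754 rules.
std::vector<double> hostDiv(const std::vector<double>& x,
                            const std::vector<double>& s) {
    assert(x.size() == s.size());  // assumption: both vectors have the same length
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = x[i] / s[i];
    }
    return out;
}
```

Calling `hostMul(x, s)` on the same inputs as the GPU version gives a simple baseline to compare the kernel output against.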
Here is my CUDA kernel:
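__global__ void elementMul(const double *A, const double *B, double *C, int n){
    // Global index, so the kernel also works when launched with more than one block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Guard against threads that fall past the end of the arrays
    if (i < n) C[i] = A[i] * B[i];
}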