I have multithreaded C code that uses OpenMP and Intel MKL functions. Here is the relevant part:
omp_set_num_threads(nth);
#pragma omp parallel for private(l,s) schedule(static)
for (l = 0; l < lines; l++)
{
    for (s = 0; s < samples; s++)
    {
        out[l*samples+s] = mkl_ddot(&bands, &hi[s*bands+l], &inc_one, &hi_[s*bands+l], &inc_one);
    }
} // end for l
I want to use all the cores of the multicore processor (the value of nth) in this pragma, but I want each core to compute a single mkl_ddot call independently (1 thread per mkl_ddot call).
I want to know how many threads mkl_ddot uses in this case. I read in some forums that, by default, MKL functions called inside an OpenMP parallel region run on only 1 core (that's what I want). But I am not sure about this behaviour, and I cannot find the section of the manual that explains this situation.
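Until I find the relevant section, the workaround I am considering is to request a single MKL thread explicitly with mkl_set_num_threads(1) before the parallel loop, rather than rely on the default. Below is a minimal sketch of that idea, not my real code. It assumes row-major arrays and uses cblas_ddot in place of my call. USE_MKL, dot_one and dot_rows are hypothetical names made up for illustration, and the plain C fallback is only there so the sketch builds without MKL.

```c
#include <assert.h>
#include <stddef.h>
#ifdef USE_MKL            /* hypothetical build macro, not something MKL defines */
#include <mkl.h>
#endif

/* Dot product of two length-n vectors: cblas_ddot when built against MKL,
   a plain C loop otherwise. */
static double dot_one(int n, const double *x, const double *y)
{
#ifdef USE_MKL
    return cblas_ddot(n, x, 1, y, 1);
#else
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
#endif
}

/* out[r] = dot(row r of a, row r of b); each OpenMP iteration handles one row. */
void dot_rows(int rows, int n, const double *a, const double *b, double *out)
{
    assert(rows >= 0 && n > 0 && a && b && out);

#ifdef USE_MKL
    /* Ask MKL for one thread per call, so the only parallelism is the
       OpenMP loop below. */
    mkl_set_num_threads(1);
#endif

    #pragma omp parallel for schedule(static)
    for (int r = 0; r < rows; r++)
        out[r] = dot_one(n, &a[(size_t)r * n], &b[(size_t)r * n]);
}
```

Calling omp_set_num_threads(nth) before dot_rows would then control how many rows are processed at once, with each call intended to stay single-threaded.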
Thanks in advance.