I am trying to analyse the scaling behaviour of a C++ program that I have parallelised with Intel OpenMP, using Intel Composer XE 2014. When I run an "Advanced Hotspots" analysis, the result shows that a library function called "__kmp_print_storage_map_gtid" accounts for the second-largest share of the total runtime. I searched for the meaning of this routine but found nothing. Is it related to the std::map data structures that I use in this part of the algorithm? Thanks in advance!
EDIT: I have now removed one barrier, which sped everything up, but a new hotspot has appeared. When I run a "Locks and Waits" analysis, the top two entries are "OMP Join Barrier mkl_blas_daxpy_omp:115" and "OMP Join Barrier mkl_blas_dcopy:155". However, I don't call any MKL routine explicitly. How can I investigate this further?