
I am trying to use a custom matvec operator with a PETSc MatShell in Fortran, and inside it I want to mix OpenMP threading with MKL (BLAS) multithreading.

The OpenMP and MKL threads are indeed launched, but htop shows that only the OpenMP threads do any work: they occupy 200% CPU (2 threads at 100%) even though 48 cores are available.

The remaining (MKL) threads are visible in htop, but they sit at 0% CPU.

How can I get the MKL threads to actually run the BLAS calls in parallel inside the MatShell matvec?

Edit: I'm glad to post more details. I'm shooting for a short message first in case someone has run into the same issue.
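
To make the setup concrete, here is a minimal sketch of the kind of MatShell callback I mean (not my actual code: the module name, the dense block B, and the scaling loop are illustrative, and it assumes a complex-valued PETSc build since the BLAS routine in question is zgemv):

    ! Minimal sketch only: a MatShell MULT callback that mixes an OpenMP loop
    ! with a threaded-MKL zgemv on a dense block. B is allocated and filled
    ! elsewhere; the 2x scaling loop just stands in for the OpenMP work.
    module shell_ctx
    #include <petsc/finclude/petscmat.h>
      use petscmat
      implicit none
      PetscScalar, allocatable :: B(:,:)   ! dense block used by the BLAS part
    contains
      subroutine mymatvec(A, x, y, ierr)
        Mat                  :: A
        Vec                  :: x, y
        PetscErrorCode       :: ierr
        PetscScalar, pointer :: xx(:), yy(:)
        integer              :: i, n

        call VecGetArrayReadF90(x, xx, ierr)
        call VecGetArrayF90(y, yy, ierr)
        n = size(xx)

        ! OpenMP part: this is what htop shows at ~200% CPU (2 threads at 100%)
        !$omp parallel do
        do i = 1, n
           yy(i) = 2.0d0*xx(i)
        end do
        !$omp end parallel do

        ! MKL part: the MKL worker threads appear in htop but stay at 0% CPU,
        ! i.e. zgemv effectively runs on a single core
        call zgemv('N', n, n, (1.0d0, 0.0d0), B, n, xx, 1, (1.0d0, 0.0d0), yy, 1)

        call VecRestoreArrayReadF90(x, xx, ierr)
        call VecRestoreArrayF90(y, yy, ierr)
      end subroutine mymatvec
    end module shell_ctx

The shell matrix itself is created with MatCreateShell, and the callback is registered with MatShellSetOperation(A, MATOP_MULT, mymatvec, ierr).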

  • PETSc is based on MPI parallelism. Are you running just one PETSc process? – Victor Eijkhout Apr 07 '23 at 13:01
  • I am using MPI. OpenMP and MKL are not being used by PETSc regardless of the number of MPI processes. – Astor Apr 07 '23 at 13:45
  • If you're using MPI, say 12 processes on your 48 cores, then each process can only have 4 threads. At least, most job starters will set it up that way. – Victor Eijkhout Apr 07 '23 at 14:49
  • @VictorEijkhout Problem is, with 1 MPI process, two OpenMP threads are launched, and MKL is working single-core. There is some interaction between PETSc and the rest of the software that keeps MKL from multithreading. – Astor Apr 08 '23 at 15:07
  • MKL has different libraries for sequential/multi-threaded. You need to link the right one. – Victor Eijkhout Apr 09 '23 at 03:17
  • Thanks @VictorEijkhout, it is indeed linking to the parallel ones: "-lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -lmkl_blacs_openmpi_lp64". It is just not working inside the matvec operator of the MatShell (the thread controls in question are sketched after these comments). – Astor Apr 10 '23 at 04:04
  • Could you let us know the OS details, hardware, and complete steps you have followed with the sample reproducer code to investigate your issue more? – Varsha - Intel Apr 14 '23 at 17:36
  • Thank you @Varsha-Intel, the problem seems to be related to MKL only, could you please take a look at https://stackoverflow.com/questions/76001117/mkl-blas-not-multithreading-zgemv ? – Astor Apr 15 '23 at 17:11
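
Edit 2: to make the threading discussion in the comments concrete, the controls on the MKL/OpenMP side look roughly like this (a sketch only; the thread counts are made up, and whether any of these settings is the actual culprit is exactly what I am trying to find out):

    ! Sketch of the MKL / OpenMP thread controls the comments refer to.
    ! The counts (4 OpenMP threads, 12 MKL threads) are illustrative only.
    program thread_controls
      use omp_lib
      implicit none
      integer, external :: mkl_get_max_threads

      ! Threads for my own !$omp regions (or set OMP_NUM_THREADS)
      call omp_set_num_threads(4)

      ! Threads per MKL BLAS call (or set MKL_NUM_THREADS); this only has an
      ! effect when a threaded MKL layer is linked (-lmkl_intel_thread with
      ! libiomp5, or -lmkl_gnu_thread with libgomp), not -lmkl_sequential
      call mkl_set_num_threads(12)

      ! By default MKL drops to a single thread when it is called from inside
      ! an active OpenMP parallel region; disabling MKL's dynamic adjustment
      ! (or setting MKL_DYNAMIC=false) lifts that restriction
      call mkl_set_dynamic(0)

      print *, 'MKL max threads per BLAS call:', mkl_get_max_threads()
    end program thread_controls

On the MPI side (the link line uses Open MPI via mkl_blacs_openmpi_lp64), the launcher may bind each rank to a subset of cores, which also pins all of that rank's threads to those cores; running with mpirun --bind-to none rules that out. I have not yet confirmed whether binding plays a role here.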

0 Answers