
I am implementing a multithreaded OpenMP program on the following machine:

Architecture:          x86_64
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2

It is a multithreaded clustering program. It shows the expected speedup for dataset sizes up to 2 million rows (~250 MB of data), but while testing on a larger dataset, many of the threads in htop show the D state and a CPU% substantially less than 99-100%. Note that for datasets up to that size, every thread runs in the R state with CPU% ~100%. The running time becomes ~100 times longer than the sequential case.

Free memory seems to be available and swap usage is 0 in all cases.

Regarding the data structures used, there are three shared data structures of size O(n), and each thread builds a private linked list that is kept for a later merging step. I suspected the problem was the extra memory used by this per-thread data structure, but even if I comment it out the program shows the same problem. A simplified sketch of the parallel structure is below; please let me know if I should provide more details.
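For context, this is roughly how the parallel region is organised (a minimal sketch only; the names, types, and the clustering logic itself are placeholders, not the actual code):

    #include <omp.h>
    #include <stdlib.h>

    /* Sketch: placeholder names, not the real program. */
    typedef struct node {
        long         row;    /* index of a clustered row */
        struct node *next;
    } node_t;

    void cluster(const double *data,   /* shared, O(n) */
                 int          *labels, /* shared, O(n) */
                 double       *dist,   /* shared, O(n) */
                 long n, int nthreads, node_t **per_thread_lists)
    {
        #pragma omp parallel num_threads(nthreads)
        {
            int tid = omp_get_thread_num();
            node_t *local = NULL;          /* private linked list, merged later */

            #pragma omp for schedule(static)
            for (long i = 0; i < n; i++) {
                /* ... the real code reads data[i] and dist[i] here ... */
                labels[i] = tid;           /* placeholder for the clustering logic */

                node_t *nd = malloc(sizeof *nd);
                nd->row  = i;
                nd->next = local;
                local    = nd;
            }

            per_thread_lists[tid] = local; /* handed to the merge step */
        }
    }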

I have only picked up OpenMP and parallel computing a few months back, so please let me know what the possible problems could be.

Hardik Malhotra
  • How many threads do you use? – Gilles Nov 17 '16 at 09:48
  • For sizes greater than 2 million rows, it drastically slows down at any level of multithreading, even with 2 threads. – Hardik Malhotra Nov 17 '16 at 09:58
  • Hard to tell but I'd guess you have a bug. Please post a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve). – Gilles Nov 17 '16 at 10:02
  • It is a large project with a modular code structure and seems difficult to share. I can share a snapshot of the htop screen while it is running, as the output itself is really indifferent to the running time. Otherwise, it would be helpful if you could suggest some commands/ways to evaluate the multithreaded code with respect to memory and efficient thread sharing. – Hardik Malhotra Nov 17 '16 at 10:21
