
My question: how can I make multiple threads use 100% of the CPU (or at least 80%), for example have 4 threads each keep one of 4 cores at 100%?

The whole story: I wrote some multithreaded POSIX (pthreads) code. When I run it on a multi-core cluster server (up to 16 cores), the wall time decreases as I use more cores, but the overall computation time (the sum of the time spent in the parts of the code without any synchronization) increases. I guess this is because some cores are not dedicated to running my threads. My guess was confirmed when I ran the code on my laptop.

I tried to set the affinity with pthread_setaffinity_np and to set the scheduling policy and priority with pthread_attr_setschedpolicy(&attributes_, SCHED_FIFO); pthread_attr_setschedparam(&attributes_, &p); but that doesn't help.
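
For reference, a minimal sketch of what pinning a thread with pthread_setaffinity_np can look like on Linux/glibc. This is not the original code: the helper names first_allowed_core, pin_self_to_core and is_pinned_to are hypothetical, and the sketch assumes the GNU extensions (_GNU_SOURCE) are available. Reading the mask back with pthread_getaffinity_np is one way to confirm that the pin actually took effect.

```c
#define _GNU_SOURCE            /* needed for the *_np calls and CPU_* macros */
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: first core the calling thread is allowed to run on,
   or -1 on error. Useful when core 0 is not in the allowed set. */
static int first_allowed_core(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &set))
            return c;
    return -1;
}

/* Hypothetical helper: pin the calling thread to a single core.
   Returns 0 on success, or an error number (e.g. EINVAL). */
static int pin_self_to_core(int core)
{
    assert(core >= 0 && core < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

/* Hypothetical helper: 1 if the calling thread's affinity mask
   contains exactly `core`, 0 otherwise. */
static int is_pinned_to(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return 0;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(core, &set);
}
```

Each worker would call pin_self_to_core with its own core index at the start of its thread function. Note also that on Linux a policy set with pthread_attr_setschedpolicy is ignored unless pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) is called on the same attribute object, and that SCHED_FIFO normally requires elevated privileges.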

Yao Zhao
  • Are your threads CPU bound? – Christian.K Dec 07 '12 at 18:51
  • Hi Christian, I think it is CPU bound. It always maxes out a single CPU, with very little I/O. – Yao Zhao Dec 07 '12 at 19:35
  • Maybe your threads are all contending for the same memory regions, forcing the CPU caches to be cold most of the time? – Celada Dec 07 '12 at 19:52
  • You cannot take advantage of 16 processor cores with 4 threads, you'll only use 4 of them. Trying to scale that up tends to produce bugs instead of faster code. – Hans Passant Dec 07 '12 at 19:59
  • Hi Celada, how can I detect the memory problem you mentioned? – Yao Zhao Dec 07 '12 at 20:40
  • Hi Hans, I do know that. My code does dynamic load balancing. The total amount of computation is fixed, and the number of threads can be varied freely from 1 to 16. – Yao Zhao Dec 07 '12 at 20:43
  • @YaoZhao use the @ sign in front of user names to get their attention. Regarding thread contention: they will access the same memory regions if they read/write to the same variables. It is also possible that they access different data that gets mapped to the same areas in cache (the same cache lines). – didierc Dec 09 '12 at 04:14
  • @didierc thank you for the reminder. The computation part does not include reads/writes to the same variables, so I guess it might be a false sharing issue. Could you give me more advice on how to deal with the cache problem? – Yao Zhao Dec 09 '12 at 06:19
  • @Celada Even if this were true, every thread would still be reported by the OS as taking 100% CPU time each. YaoZhao, if you get a total CPU time that is less than (100*N)% with N threads, it means that some of them are blocked in a system call (lock contention?). – Armin Rigo Dec 13 '12 at 14:20
  • @ArminRigo you're absolutely right. Scratch that theory I guess :-( – Celada Dec 13 '12 at 21:42
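
The two points raised in the comments (per-thread CPU time versus elapsed time, and threads sharing cache lines) can be checked with something like the sketch below. It is an illustration under assumptions, not the asker's code: the worker loop, the names slot, worker and min_utilisation, the 64-byte cache-line size and the 16-thread limit are all hypothetical. Each thread compares CLOCK_THREAD_CPUTIME_ID with CLOCK_MONOTONIC over a pure compute region; a ratio well below 1 suggests the thread was blocked or preempted. Each thread also writes to its own cache-line-aligned slot, so the result fields cannot share a line.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define CACHE_LINE  64   /* assumed cache-line size; typical on x86-64 */
#define MAX_THREADS 16   /* assumed upper bound, matching the 16-core server */

/* One slot per thread. Aligning the first member to a cache line pads the
   struct so that neighbouring slots never share a line (no false sharing). */
struct slot {
    _Alignas(CACHE_LINE) double cpu_s; /* CPU time used by this thread */
    double wall_s;                     /* elapsed time over the same region */
    unsigned long sum;                 /* dummy result, keeps the loop alive */
};

static double seconds(clockid_t id)
{
    struct timespec t;
    clock_gettime(id, &t);
    return (double)t.tv_sec + (double)t.tv_nsec * 1e-9;
}

/* Hypothetical CPU-bound worker with no synchronization inside the loop. */
static void *worker(void *arg)
{
    struct slot *s = arg;
    double c0 = seconds(CLOCK_THREAD_CPUTIME_ID);
    double w0 = seconds(CLOCK_MONOTONIC);

    unsigned long acc = 0;
    for (unsigned long i = 0; i < 50000000UL; i++)
        acc += i ^ (acc >> 3);

    s->sum = acc;
    s->cpu_s = seconds(CLOCK_THREAD_CPUTIME_ID) - c0;
    s->wall_s = seconds(CLOCK_MONOTONIC) - w0;
    return NULL;
}

/* Runs n workers and returns the lowest cpu/wall ratio among them,
   or -1.0 if no thread could be started. */
static double min_utilisation(int n)
{
    assert(n >= 1 && n <= MAX_THREADS);
    struct slot slots[MAX_THREADS];
    pthread_t tid[MAX_THREADS];
    int started = 0;

    while (started < n &&
           pthread_create(&tid[started], NULL, worker, &slots[started]) == 0)
        started++;

    double min = -1.0;
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        double r = slots[i].cpu_s / slots[i].wall_s;
        if (min < 0.0 || r < min)
            min = r;
    }
    return min;
}
```

If min_utilisation(n) stays close to 1 while the summed CPU time still grows with n, the threads are running but doing slower work, which points towards memory or cache effects. If the ratio drops well below 1, the threads are waiting on something (a lock, a system call, or the scheduler).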

0 Answers