
I'm having a problem running a C++ program on a powerful multi-core server that runs Ubuntu. The problem is that my app uses less than 10% of one CPU there, but the same app uses around 100% of one CPU on my i3 notebook, which runs a different version of Ubuntu.

My OS:

Linux version 3.11.0-23-generic (buildd@batsu) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #40~precise1-Ubuntu SMP Wed Jun 4 22:06:36 UTC 2014

The server's OS:

Linux version 3.11.0-12-generic (buildd@allspice) (gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu7) ) #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013

At least for now, I do not need to parallelize the code or make it more efficient. I just want to know how I can achieve 100% use of a core on this server.

Could anyone help me?

  • What tool are you using to show the CPU usage? It's possible that the tool is showing the usage as a fraction of *all* CPUs, so if your code is single-threaded it would just appear to use a small fraction of a multiprocessor. – Rufflewind Jan 22 '15 at 21:31
  • I am using top. But the time taken to run on the server is greater than on my notebook. – Thiago Silva Jan 22 '15 at 21:42
  • I think your approach is wrong. You should be thinking about "how do I make this application calculate/accomplish the things it needs to do most effectively", not "how can I make it such a CPU-constrained program that it uses as much CPU time as possible"... – twalberg Jan 23 '15 at 17:15

1 Answer


It may not be your OS but rather the compiler. Compilers are moving targets; year by year they (hopefully) improve their automatic optimizations. Your code may be getting vectorized without you knowing it. Yes, I realize the two machines are on different compiler versions.

See if you still have the performance delta when you disable all optimizations (-O0 or some such). If you are trying to maximize CPU cycles, you may be using numerical calculations that are easily vectorized. The same goes for parallelization. You can also get general optimization reports, as well as a specific vectorization report, from GCC; the exact report option depends on your GCC version.
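As a quick check, a small test program along these lines makes the effect easy to see (the file name and compile lines are just a sketch on my part, not something from your question; -ftree-vectorizer-verbose is the GCC 4.x report option, newer GCC uses -fopt-info-vec). Build it once with -O0 and once with -O3, time both runs, and compare:

    // vec_test.cpp -- a deliberately vectorizable loop for comparing optimization levels.
    // Suggested (hypothetical) compile lines:
    //   g++ -O0 vec_test.cpp -o vec_test_noopt
    //   g++ -O3 -ftree-vectorizer-verbose=2 vec_test.cpp -o vec_test_opt
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t n = 1 << 24;            // ~16M elements per vector
        std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n);

        for (int rep = 0; rep < 50; ++rep)        // repeat so the runtime is easy to measure with `time`
            for (std::size_t i = 0; i < n; ++i)   // simple loop GCC can auto-vectorize at -O3
                c[i] = a[i] * b[i] + a[i];

        std::printf("%f\n", c[n / 2]);            // use the result so the loop is not optimized away
        return 0;
    }

If the -O3 build is several times faster, the optimizer (and probably the vectorizer) is doing real work on this code, and the compiler difference between the two machines starts to matter.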

Also, there is a world of difference between the number of cores on a server (probably a multi-core Xeon) and on your i3. Your i3 has 2 cores, each capable of running two hardware threads, meaning you have in effect 4 CPUs. Depending upon your server configuration, you can have up to 18 cores with two hardware threads each in a single processor. That translates to 36 effective CPUs. And you can have multiple processors per motherboard. You can do the math.
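If you want to see how many logical CPUs each machine actually exposes, a one-liner like this will tell you (assuming a C++11-capable compiler; the compile line below is my guess, use -std=c++0x on older GCC 4.6):

    // cpu_count.cpp -- print the number of logical CPUs visible to the process.
    // Hypothetical compile line: g++ -std=c++11 -pthread cpu_count.cpp -o cpu_count
    #include <iostream>
    #include <thread>

    int main() {
        unsigned n = std::thread::hardware_concurrency();  // 0 means "unknown" on some toolchains
        std::cout << "logical CPUs: " << n << '\n';
        return 0;
    }

Run it on both the notebook and the server so you know how many logical CPUs that 10% (or 100%) figure is being measured against.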

Both the compiler and the OS can affect an application's processor use. If you are forking off multiple threads to try to consume processing, the OS can farm those out to different processors, spreading the load so that no single CPU looks heavily used. Even if you are running pure serial code, a compiler with auto-parallelization enabled can break the code into multiple threads that the threading library then distributes over those 36 effective CPUs.

Your OS can also meter how much processing you can hog. If this server is not under your control, the administrator may have established a policy that limits the percentage of processing any one application can consume.
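If you have shell access, one thing you can check yourself is whether a CFS bandwidth quota has been set on the CPU cgroup. The sketch below only looks at the root cpu controller under a cgroup v1 layout; the /sys/fs/cgroup/cpu path is an assumption about how the server is set up, and your admin may use a different hierarchy entirely:

    // quota_check.cpp -- rough check for a CPU bandwidth quota on the root cgroup.
    // Assumes a cgroup v1 layout mounted at /sys/fs/cgroup/cpu, which may not match
    // your server; /proc/self/cgroup shows which cgroup your process is actually in.
    #include <fstream>
    #include <iostream>
    #include <string>

    int main() {
        std::ifstream quota("/sys/fs/cgroup/cpu/cpu.cfs_quota_us");
        std::ifstream period("/sys/fs/cgroup/cpu/cpu.cfs_period_us");
        std::string q, p;
        if (std::getline(quota, q) && std::getline(period, p))
            std::cout << "cfs quota: " << q << " us per " << p << " us period (-1 means no quota)\n";
        else
            std::cout << "no cpu cgroup files found at this path\n";
        return 0;
    }

A quota of -1 (or missing files) means nothing is being throttled at that level, and you should look elsewhere.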

So, in conclusion:

(1) Disable all optimization.
(2) Check per-core CPU usage to see what the load is on each of your effective CPUs.
(3) Restructure your code to farm out tasks across your effective CPUs, each consuming as much processing as possible (see the sketch below).
(4) Make sure your admin isn't limiting the amount of processing individual applications can consume.
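On point (3), if the goal is simply to prove that the server will give you a full core (or all of them), a throwaway stress program like this sketch is enough; the file name, compile line, and iteration count are mine, not from your question. While it runs, press 1 in top to watch per-core usage:

    // spin_all.cpp -- spawn one busy thread per logical CPU to check that the machine
    // really lets you use 100% of each core.
    // Hypothetical compile line: g++ -std=c++11 -O2 -pthread spin_all.cpp -o spin_all
    #include <cstdio>
    #include <thread>
    #include <vector>

    int main() {
        unsigned n = std::thread::hardware_concurrency();
        if (n == 0) n = 4;                         // fall back if the library cannot tell

        std::vector<std::thread> workers;
        for (unsigned t = 0; t < n; ++t) {
            workers.emplace_back([] {
                volatile double x = 0.0;           // volatile so the busy loop is not optimized away
                for (long i = 0; i < 2000000000L; ++i)
                    x += 1e-9;
            });
        }
        for (std::thread &w : workers)
            w.join();                              // each core should sit near 100% until this returns
        std::printf("done with %u threads\n", n);
        return 0;
    }

If every logical CPU goes to roughly 100% here but your real application still sits at 10%, the limit is inside the application (I/O waits, locking, memory stalls) rather than anything the OS or the admin is doing.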

Taylor Kidd