
I've written a little test program that spawns a large number of threads (in my case 32 threads on a computer with 4 cores) and pins them all to one core with the pthread_setaffinity_np call.

These threads run in a loop in which they report the result of the sched_getcpu call via stdout and then sleep for a short time. I wanted to see how strictly the OS adheres to a user's thread-pinning settings (even if they don't make sense, as in my case). All threads report running on the core I pinned them to, which is what I expected.

However, I've noticed that while the program is running, CPU utilization on all 4 cores is around 100% (normally it's between 0% and 25%). Could someone enlighten me as to why this would be the case? I would have expected utilization on the pinned core to be maximal, with utilization on the other cores maybe a little higher than usual to compensate.

I can append my code if necessary, but I figured it's pretty straightforward and thus not really necessary. I did the test on a fairly old PC with Ubuntu 18.04.

Update

#define _GNU_SOURCE

#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

#define THREADS 32
#define PINNED 3
#define MINUTE 60
#define MILLISEC 1000

void thread_main(int id);

int main(int argc, char** argv) {
    int i;
    pthread_t pthreads[THREADS];

    printf("%d threads will be pinned to cpu %d\n", THREADS, PINNED);

    for (i = 0; i < THREADS; ++i) {
        pthread_create(&pthreads[i], NULL, &thread_main, i);
    }

    sleep(MINUTE);

    return 0;
}

void thread_main(int id) {
    printf("thread %d: initially running on cpu %d\n", id, sched_getcpu());

    pthread_t pthread = pthread_self();
    cpu_set_t cpu_set;

    CPU_ZERO(&cpu_set);
    CPU_SET(PINNED, &cpu_set);

    assert(0 == pthread_setaffinity_np(pthread, sizeof(cpu_set_t), &cpu_set));

    while (1) {
        printf("thread %d: running on cpu %d\n", id, sched_getcpu());
        //usleep(MILLISEC);
    }
}

When I close all background applications, utilization is not quite 100%, but all 4 cores are still loaded to a significant degree.

Oliver
  • We need to see the code, as from what you describe this should not happen, so the code might have a bug. – nos Aug 17 '18 at 10:19
  • It looks like the threads printf forever to stdout with 100% CPU? Where do you put them to sleep? What else is there to expect if you don't? – Lundin Aug 17 '18 at 13:44
  • I only commented out the usleep for testing, sorry. My issue is that all CPUs spike, although all threads appear to run on only one cpu. – Oliver Aug 17 '18 at 13:53
  • You most definitely should *not* put *any* side-effecty stuff in `assert(/* here */)` because when `NDEBUG` is defined, your `assert`ions will be stripped away... and so will that logic. I'd suggest something like `int fubar = pthread_setaffinity_np(...); assert(!fubar);`... – autistic Aug 18 '18 at 00:52
  • Additionally, `thread_main` should be declared to return `void *`. You're likely seeing (at the very least) a warning about this... Please fix your warnings, or ask about them, before you ask about anything else. – autistic Aug 18 '18 at 00:56
  • `sched_getcpu()` are you having any problems with this? If not, why is it in your testcase? While you're fixing your testcase, I suggest thinking about whether we actually *need* to see some of this code... Okay, we need it if it contributes towards *seeing the problem*... right? So if your code needs to compile to display the symptoms, you can't give us code that doesn't compile, for example... but `sched_getcpu()`... I do not think this is relevant for you, so you should form a testcase without it... right? – autistic Aug 18 '18 at 01:01
  • If you're running these in a pseudo-terminal, then another process is receiving all of that `printf` output and processing it, which requires CPU time as well. That process (your terminal, likely also Xorg) is going to show up heavily in profiles. Consider that graphically rendering that text output is going to be far more CPU-intensive than the `printf()` that generates it. Try running your test process with output redirected to `/dev/null`. – caf Aug 18 '18 at 01:47
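
The two code-quality points raised in the comments (no side effects inside `assert`, and a thread function that returns `void *`) could look roughly like the sketch below. It is not the original program: `pin_self_to_cpu` is a hypothetical helper name introduced here, and passing the thread id through `intptr_t` is just one common convention.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for cpu_set_t, sched_getcpu, pthread_setaffinity_np */
#endif

#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>

#define PINNED 3 /* same target CPU as in the question */

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, or the error number from pthread_setaffinity_np. */
int pin_self_to_cpu(int cpu) {
    cpu_set_t cpu_set;

    CPU_ZERO(&cpu_set);
    CPU_SET(cpu, &cpu_set);

    return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpu_set);
}

/* Thread entry point with the signature pthread_create expects. */
void *thread_main(void *arg) {
    int id = (int)(intptr_t)arg;

    /* The call happens outside assert(), so it still runs with NDEBUG. */
    int rc = pin_self_to_cpu(PINNED);
    assert(rc == 0);
    (void)rc; /* avoids an unused-variable warning when NDEBUG is defined */

    printf("thread %d: running on cpu %d\n", id, sched_getcpu());
    return NULL;
}
```

With this signature, the creation call in `main` would become `pthread_create(&pthreads[i], NULL, thread_main, (void *)(intptr_t)i);`.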

1 Answer


@caf

> If you're running these in a pseudo-terminal, then another process is receiving all of that printf output and processing it, which requires CPU time as well. That process (your terminal, likely also Xorg) is going to show up heavily in profiles. Consider that graphically rendering that text output is going to be far more CPU-intensive than the printf() that generates it. Try running your test process with output redirected to /dev/null.

This is the correct answer, thanks.

With the output redirected to /dev/null, the CPU usage spikes are restricted to the CPU that all the threads are pinned to.
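
For anyone who wants to check this without a graphical system monitor (which itself uses CPU time), per-core utilization can be estimated from two samples of `/proc/stat`. The following is only a rough sketch under stated assumptions: the helper names and the `MAX_CPUS` bound are made up for illustration, and "busy" is taken to mean everything except the idle and iowait counters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_CPUS 256 /* assumed upper bound for this sketch */

/* Hypothetical helper: parse one per-CPU line of /proc/stat, e.g.
 * "cpu3 100 0 50 800 10 0 5 0 0 0". Returns 1 and fills in the CPU index
 * plus busy/total tick counts; returns 0 for the aggregate "cpu" line
 * and for any unrelated line. */
int parse_cpu_line(const char *line, int *cpu,
                   unsigned long long *busy, unsigned long long *total) {
    unsigned long long user, nice, sys, idle;
    unsigned long long iowait = 0, irq = 0, softirq = 0, steal = 0;

    assert(line && cpu && busy && total);
    if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
        return 0;
    if (sscanf(line + 3, "%d %llu %llu %llu %llu %llu %llu %llu %llu",
               cpu, &user, &nice, &sys, &idle,
               &iowait, &irq, &softirq, &steal) < 5)
        return 0;

    *busy = user + nice + sys + irq + softirq + steal;
    *total = *busy + idle + iowait;
    return 1;
}

/* Read the counters of all CPUs; returns highest CPU index + 1, or -1. */
static int sample_cpus(unsigned long long busy[], unsigned long long total[]) {
    char line[512];
    int count = 0;
    FILE *f = fopen("/proc/stat", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        int cpu;
        unsigned long long b, t;
        if (parse_cpu_line(line, &cpu, &b, &t) && cpu >= 0 && cpu < MAX_CPUS) {
            busy[cpu] = b;
            total[cpu] = t;
            if (cpu + 1 > count)
                count = cpu + 1;
        }
    }
    fclose(f);
    return count;
}

/* Print the busy percentage of each CPU over the given interval. */
int print_cpu_usage(unsigned seconds) {
    static unsigned long long b0[MAX_CPUS], t0[MAX_CPUS];
    static unsigned long long b1[MAX_CPUS], t1[MAX_CPUS];
    int n = sample_cpus(b0, t0);

    if (n <= 0)
        return -1;
    sleep(seconds);
    if (sample_cpus(b1, t1) != n)
        return -1;
    for (int i = 0; i < n; ++i) {
        unsigned long long dt = t1[i] - t0[i];
        if (dt == 0)
            continue; /* no ticks recorded, e.g. an offline CPU */
        printf("cpu%d: %.1f%% busy\n", i,
               100.0 * (double)(b1[i] - b0[i]) / (double)dt);
    }
    return 0;
}
```

Calling `print_cpu_usage(5)` from a small `main` while the test program runs with its output redirected to /dev/null should make it visible whether only the pinned CPU is busy.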

Oliver