
I have a complex program which should use all cores to perform complex math calculations.

I have a system with two Intel Xeon Platinum 8160. Each of them has 24 cores so together I have 48 cores and 96 threads.

My program uses only 24 cores instead of all 48. It runs on the 24 cores of the first CPU or on the 24 cores of the second one, but never on both at once.

When I start a second instance of the program, nothing changes: still only one CPU is used.

I attach some screenshots.

[screenshot]

I extracted some code to a minimal working example, which checks how many threads are available. Only 48 are detected and not all 96 threads.

#include <stdlib.h>
#include <stdio.h>
#include <windows.h>
#include <math.h>
#include <process.h>

static void thread_start(void *thread) {
    int i;
    i = *(int*)thread;
    for (;;) {
        i = (int)sqrt(i + 1);  /* busy work; sqrt(i++) assigned back to i was undefined behavior */
    }
}

int main (int argc, char * argv[]) {
    SYSTEM_INFO sysi;
    int thread_max, i;

    (void)argc;  /* unused */
    (void)argv;  /* unused */

    GetSystemInfo(&sysi);
    thread_max = sysi.dwNumberOfProcessors;

    printf("\n... thread_max=%d\n", thread_max);

    printf("\n\n");

    for (i = 0; i < thread_max *2; i++) {
        /* pass thread_max, which main no longer modifies; &i raced with the loop counter */
        _beginthread(thread_start, 0, &thread_max);
    }

    for (;;) i = i;

    // return EXIT_SUCCESS;

}

[screenshot]

My machine runs Windows 10 Pro 64-bit. What could be the problem?

Student
Felix
  • what does the set affinity in task manager say? – 0___________ Jul 28 '18 at 12:39
  • what is set affinity? – Felix Jul 28 '18 at 12:56
  • @Felix [Thread Affinity is the ability to tell the OS that you want a thread scheduled on a particular core if possible](https://learn.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-setthreadaffinitymask) – Mgetz Jul 28 '18 at 13:01
  • where can I check that? – Felix Jul 28 '18 at 13:03
  • A machine with that many cores uses a NUMA architecture: separate processor chips, each with their own memory bus, glued together with an interconnect that is needed when data has to be shoveled from one to the other. Enshrined in the WinAPI as well, albeit for a different reason, these processors are organized into separate groups. A group can't hold more than 64 logical processors. You are only using one group and therefore can see only half of them. https://learn.microsoft.com/en-us/windows/desktop/procthread/processor-groups – Hans Passant Jul 28 '18 at 13:10
  • Task manager -> details => right click on the process – 0___________ Jul 28 '18 at 13:17
  • each process is by default assigned to exactly one group – RbMm Jul 28 '18 at 14:21
  • @HansPassant how can I use both groups? – Felix Jul 28 '18 at 17:21
  • You probably shouldn't, that interconnect can very easily turn into a significant bottleneck. Google "NUMA programming techniques" to find the tools you need to do it correctly. I'll volunteer my machine to swap with yours, another very easy fix :) – Hans Passant Jul 28 '18 at 17:26
  • ;) And why does a second instance of the same exe not use the idle 24 cores when I run two instances? – Felix Jul 28 '18 at 17:54
  • Any further hints? – Felix Feb 11 '19 at 06:34

1 Answer


GetSystemInfo reports, in dwNumberOfProcessors, the number of logical processors in the current processor group, not the total number in the machine. The Microsoft documentation describes the field as:

The number of logical processors in the current group. To retrieve this value, use the GetLogicalProcessorInformation function.

You should instead call GetActiveProcessorCount with the ALL_PROCESSOR_GROUPS parameter. This counts the logical processors across all groups, as recommended by Raymond Chen.

MW_dev