I have a virtual machine on Google Cloud with 1 CPU socket, 16 cores, and 2 threads per core (hyper-threading).
This is the output of lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Stepping: 0
CPU MHz: 2300.000
BogoMIPS: 4600.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-31
I'm running my process on it and I'm trying to distribute my threads among the different logical CPUs.
unsigned num_cpus = std::thread::hardware_concurrency();
LOG(INFO) << "Going to assign threads to " << num_cpus << " logical cpus";
cpu_set_t cpuset;
// Start one worker per logical CPU, leaving 5 logical CPUs without a worker.
// Writing the bound as i + 5 < num_cpus avoids unsigned underflow when num_cpus < 5.
for (unsigned i = 0; i + 5 < num_cpus; i++) {
    worker_threads.push_back(std::thread(&CalculationWorker::work, &(workers[i]), i));
    // Create a cpu_set_t object representing a set of CPUs. Clear it and mark
    // only CPU i as set.
    CPU_ZERO(&cpuset);
    CPU_SET(i, &cpuset);
    int rc = pthread_setaffinity_np(worker_threads[i].native_handle(),
                                    sizeof(cpu_set_t), &cpuset);
    if (rc != 0) {
        LOG(ERROR) << "Error calling pthread_setaffinity_np: " << rc << "\n";
    } else {
        LOG(INFO) << "Set affinity for worker " << i << " to " << i;
    }
}
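To rule out the affinity call itself, the mask can be read back with pthread_getaffinity_np. The following is only a minimal sketch for Linux/glibc; allowed_cpus and pin_to_cpu are hypothetical helper names that are not part of my program. Note also that each worker starts running before its affinity is set, so a sched_getcpu() call made very early in the thread may still report the CPU it started on.

```cpp
#include <pthread.h>
#include <sched.h>
#include <vector>

// Hypothetical helper: returns the CPUs the given thread is allowed to run on,
// or an empty vector if the affinity mask could not be read.
std::vector<int> allowed_cpus(pthread_t thread) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    std::vector<int> cpus;
    if (pthread_getaffinity_np(thread, sizeof(cpu_set_t), &mask) != 0) {
        return cpus;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &mask)) {
            cpus.push_back(cpu);
        }
    }
    return cpus;
}

// Hypothetical helper: pins the thread to a single CPU; returns true on success.
bool pin_to_cpu(pthread_t thread, int cpu) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return pthread_setaffinity_np(thread, sizeof(cpu_set_t), &mask) == 0;
}
```

Calling allowed_cpus(worker_threads[i].native_handle()) after the loop should return a vector containing only i if the pin took effect.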
The thing is that num_cpus is indeed 32, but when I run the following line in each of the running threads:
LOG(INFO) << "Worker thread " << worker_number << " on CPU " << sched_getcpu();
sched_getcpu() returns 0 for all threads.
Does it have something to do with the fact that this is a virtual machine?
UPDATE:
I found out that pthread_setaffinity_np does work. Apparently there were some daemon processes running in the background, and that's why I saw the other cores being utilised.
However, sched_getcpu still doesn't work: it returns 0 on all threads, although I can clearly see they run on different cores.
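One diagnostic that might narrow this down is to compare the glibc wrapper with the raw getcpu system call. As far as I understand, sched_getcpu() on x86_64 is usually served through the vDSO rather than a real system call, so the two paths could disagree inside a VM. This is only a sketch under that assumption; cpu_via_raw_syscall and cpu_via_glibc are hypothetical helper names.

```cpp
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper: asks the kernel for the current CPU through the raw
// getcpu system call, bypassing the C library wrapper. Returns -1 on failure.
int cpu_via_raw_syscall() {
    unsigned cpu = 0;
    unsigned node = 0;
    if (syscall(SYS_getcpu, &cpu, &node, nullptr) != 0) {
        return -1;
    }
    return static_cast<int>(cpu);
}

// Hypothetical helper: the value reported by the glibc wrapper.
int cpu_via_glibc() {
    return sched_getcpu();
}
```

Logging both values from inside each pinned worker would show whether only the wrapper reports 0 or whether the kernel itself does.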