I am trying to reproduce two opaque data types from the pthreads library in NASM. These data types are pthread_attr_t and cpu_set_t from pthread_attr_setaffinity_np (see http://man7.org/linux/man-pages/man3/pthread_attr_setaffinity_np.3.html).

I created a simple C program to call pthread_attr_setaffinity_np and stepped through it with gdb to examine the layout of those two objects (cpu_set_t is the affinity bitmask; pthread_attr_t is the opaque thread attributes object).

When I debug the C version with gdb, I print the values of attr and cpus:

(gdb) p attr
$2 = {__size = '\000' <repeats 17 times>, "\020", '\000' <repeats 37 times>, __align = 0}

(gdb) p cpus
$3 = {__bits = {1, 0 <repeats 15 times>}}

What do those two type formats translate into for assembly language?

Here is the C code:

#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

void* DoWork(void* args) {
    printf("ID: %lu, CPU: %d\n", (unsigned long)pthread_self(), sched_getcpu());
    return 0;
}

int main() {   

    int numberOfProcessors = sysconf(_SC_NPROCESSORS_ONLN);
    printf("Number of processors: %d\n", numberOfProcessors);

    pthread_t threads[numberOfProcessors];

    pthread_attr_t attr;
    cpu_set_t cpus;
    pthread_attr_init(&attr);

    for (int i = 0; i < numberOfProcessors; i++) {
       CPU_ZERO(&cpus);
       CPU_SET(i, &cpus);
       pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpus);
       pthread_create(&threads[i], &attr, DoWork, NULL);
    }

    for (int i = 0; i < numberOfProcessors; i++) {
        pthread_join(threads[i], NULL);
    }

    return 0;
}

Thanks very much for any help.

RTC222
  • Look in the headers how the types and helper macros are defined. Also, don't do this in asm if at all possible. – Jester Jan 16 '20 at 19:51
  • It may be best to instantiate threads in a call to a C shared object. I will try that, to circumvent the difficulty of translating these opaque types into assembly. – RTC222 Jan 16 '20 at 19:59
  • `cpu_set_t` is obviously just a flat bitmap, and even documented as such in the man page. The only question is the size of the bitmap, as I explained in my answer to your previous question: CPU_ZERO "undefined symbol" using pthread_setaffinity_np in NASM (https://stackoverflow.com/a/59638938). From GDB you can see the number of qwords in the current definition. – Peter Cordes Jan 17 '20 at 00:42
  • That's true, and it's 128 bytes -- but my problem now is with the attr argument of pthread_attr_setaffinity_np, not the cpuset argument. – RTC222 Jan 17 '20 at 20:27
  • If you really want to use asm, you *could* always use system calls directly instead of the libpthread wrappers. kernel ABIs are supposed to always be stable, so you could use a `clone` system call to start a thread, after mmaping stack space for it. Otherwise just look at how your C compiles and copy that. – Peter Cordes Jan 18 '20 at 07:41
  • Doing everything from NASM is appealing but not necessary now (and it's a lot more complex). The problem that gave rise to my question at https://stackoverflow.com/questions/59795342/what-are-the-numeric-flag-values-to-call-dlopen-in-assembly was that I was calling a C shared object from a NASM shared object, but I solved it by linking their two object files into one with the proper externs in each source file. Now the linkage problem is solved. The threads are created in the C program that is called from the NASM program; the pthread_create args include the pointer to the NASM function. – RTC222 Jan 20 '20 at 01:10
  • So far it works to create the cores in core order and call the NASM function. I should be finished tomorrow and then I can reply to my most recent posts to explain the issue and the solution. I expect that will also answer the spinlock issue with lock cmpxchg that I posted on Dec 21 (https://stackoverflow.com/questions/59439432/lock-cmpxchg-fails-to-execute-threads-in-core-order) because the spinlock will naturally fail if the threads are not on separate cores in sequential order. – RTC222 Jan 20 '20 at 01:10

1 Answer

Creating threads from NASM with pthreads is fairly easy, but setting the affinity mask is another matter. It turned out that I did not need to reproduce the opaque types in assembly language at all, which would have been very difficult.

Instead, I compiled the C program to an object file and linked that object file with the NASM object file to produce the final executable. The main() function in the C file was renamed, because the C file is no longer built as a standalone executable, and the new function name is declared with "extern" in the NASM program. Here's the final C code:

#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

extern void *Test_fn(void *); // NASM thread function; signature required by pthread_create

int thread_create_in_C() {

    int numberOfProcessors = sysconf(_SC_NPROCESSORS_ONLN);

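    if (numberOfProcessors >= 2) {
        // Assumes two logical CPUs (e.g. virtual cores) per physical core.
        // Which logical CPU numbers share a core is system-specific.
        numberOfProcessors = numberOfProcessors / 2;
    }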

    printf("Number of processors: %d\n", numberOfProcessors);

    pthread_t threads[numberOfProcessors];

    pthread_attr_t attr;
    cpu_set_t cpus;
    pthread_attr_init(&attr);

    for (int i = 0; i < numberOfProcessors; i++) {
       CPU_ZERO(&cpus);
       CPU_SET(i, &cpus);
       printf("Core created %d\n", i);
       pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpus);
       pthread_create(&threads[i], &attr, Test_fn, NULL);
    }

    for (int i = 0; i < numberOfProcessors; i++) {
        pthread_join(threads[i], NULL);
        printf("Core joined %d\n", i);
    }

    return numberOfProcessors;
}

In the NASM code, an "extern thread_create_in_C" directive, placed with the other externs, references the C function. In the C code, the extern declaration of Test_fn references the NASM function that each thread runs; that label must be exported with "global Test_fn" on the NASM side so the linker can resolve it.

We call the C function at the appropriate point in the NASM program with:

call thread_create_in_C wrt ..plt

My suggestion to anyone who needs to set affinity masks for threads in assembly language is to use a C function like the one above instead of trying to replicate it in assembly. For simple thread creation without affinity masks, calling the pthreads library directly from NASM is enough.

RTC222