
I have a multi-threaded C program written with Pthreads that currently uses 2 threads. I want to increase the number of threads and measure the speed-up. I would like to run the code in an automated manner, incrementing the number of threads on each run, and then display the running times graphically. I would appreciate pointers on how to do this, especially on how to automate the whole process and plot the results. Here is my code:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 2
#define VECTOR_SIZE 40

struct DOTdata
{
    /* data */
    long X[VECTOR_SIZE];
    long Y[VECTOR_SIZE];
    long sum;
    long compute_length;
};

struct DOTdata dotstr;
pthread_mutex_t mutex_sum;

void *calcDOT(void *);

int main(int argc, char *argv[])
{
    long vec_index;

    for(vec_index = 0 ; vec_index < VECTOR_SIZE ; vec_index++){
        dotstr.X[vec_index] = vec_index + 1;
        dotstr.Y[vec_index] = vec_index + 2; 
    }

    dotstr.sum = 0;
    dotstr.compute_length = VECTOR_SIZE/NUM_THREADS;

    pthread_t call_thread[NUM_THREADS];
    pthread_attr_t attr;
    void *status;

    pthread_mutex_init(&mutex_sum, NULL);

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    long i;

    for(i = 0 ; i < NUM_THREADS ; i++){
        pthread_create(&call_thread[i], &attr, calcDOT, (void *)i);
    }

    pthread_attr_destroy(&attr);

    for (i = 0 ; i < NUM_THREADS ; i++){
        pthread_join(call_thread[i], &status);
    }

    printf("Resultant X*Y is %ld\n", dotstr.sum);
    pthread_mutex_destroy(&mutex_sum);
    pthread_exit(NULL);
}

void *calcDOT(void *thread_id)
{
    long vec_index;
    long start_index;
    long end_index;
    long length;
    long offset;
    long sum = 0;

    offset = (long)thread_id;
    length = dotstr.compute_length;

    start_index = offset * length;
    end_index = (start_index + length) - 1;

    for(vec_index = start_index ; vec_index < end_index ; vec_index++){
        sum += (dotstr.X[vec_index] * dotstr.Y[vec_index]);
    }

    pthread_mutex_lock(&mutex_sum);
    dotstr.sum += sum;
    pthread_mutex_unlock(&mutex_sum);

    pthread_exit((void *)thread_id);

}

I would like to increment NUM_THREADS, run the program after each increment, record the execution time of each run, and plot execution time against the number of threads.

Abhijeet Mohanty
  • Which OS? And to make it clear, have you already managed to make your application work with an arbitrary number of threads or you're also seeking help for that? – giusti Jan 01 '17 at 13:04
  • My code can use n threads, but n is a constant throughout its execution. My OS is macOS Sierra. – Abhijeet Mohanty Jan 01 '17 at 13:11
  • I can vary my NUM_THREADS parameter and run it separately, but I want to automate the process, record the execution time of each cycle and graphically display it. – Abhijeet Mohanty Jan 01 '17 at 13:19
  • You can modify your code to take the number of threads from the command-line or standard input. Then you can write a script to call your program several times. I'm not versed in Mac, but I think you can use Bash and `time` to measure your program. – giusti Jan 01 '17 at 13:26
  • Or instead of modifying your program, an easier solution would be to make your script supply `NUM_THREADS` to the compiler as a macro. In GCC you can use the option `-D` for that. Write a script that will compile your program with different numbers of threads, run it, and measure the running time. – giusti Jan 01 '17 at 13:28
  • If this isn't enough to help you find your way, I think your question would be better received in either the Unix or the "Ask Different" community. Check them both to see which one would suit you better. – giusti Jan 01 '17 at 13:29
  • Thanks for the input @g – Abhijeet Mohanty Jan 01 '17 at 13:33
  • this line: `dotstr.compute_length = VECTOR_SIZE/NUM_THREADS;` will not work as expected. This is because it uses an integer divide, so any remainder will be truncated. So this will only work correctly when NUM_THREADS divides exactly into VECTOR_SIZE. When NUM_THREADS is larger than VECTOR_SIZE, the compute length will be 0 – user3629249 Jan 01 '17 at 16:36
  • the posted code can eliminate all the `attr` statements and just pass NULL as the second parameter to `pthread_create()` – user3629249 Jan 01 '17 at 16:46
  • regarding this line: `pthread_create(&call_thread[i], &attr, calcDOT, (void *)i);` in general, the last parameter should be the address of the data, not casting the data as if it were a pointer. Suggest: `pthread_create(&call_thread[i], &attr, calcDOT, (void *)&i);` Then this line: `offset = (long)thread_id;` should be: `offset = *(long *)thread_id;` – user3629249 Jan 01 '17 at 16:52
  • the posted code is not using the returned value from the threads, to avoid clutter, suggest using: `pthread_exit( NULL );` and `pthread_join( call_thread[i], NULL );` – user3629249 Jan 01 '17 at 16:59
  • the posted code needs to be checking the returned value from: `pthread_create()` `pthread_join()` to assure they were successful. – user3629249 Jan 01 '17 at 17:14
  • this line: `pthread_mutex_t mutex_sum;` would be better written as: `pthread_mutex_t mutex_sum = PTHREAD_MUTEX_INITIALIZER;` and eliminate the call to `pthread_mutex_init()` – user3629249 Jan 01 '17 at 17:15
  • an important thing to note: breaking the calculation into numerous threads will SLOW DOWN the program due to context switching, etc. – user3629249 Jan 01 '17 at 17:20
  • `(void *)i` this results in implementation defined behaviour. Didn't you notice some warning during compilation? – babon Jan 01 '17 at 17:42

1 Answer


I tried a naive approach: increase the number of threads in a loop, time each run with `clock()` from time.h, and plot the results with gnuplot. Each iteration doubles the number of threads and prints the time taken for that iteration. gnuplot then displays a graph with the number of threads on the x-axis and the execution time on the y-axis.

[Image: gnuplot graph of execution time against number of threads]

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NUM_THREADS 2
#define VECTOR_SIZE 40

struct DOTdata {
    /* data */
    long X[VECTOR_SIZE];
    long Y[VECTOR_SIZE];
    long sum;
    long compute_length;
};

struct DOTdata dotstr;
pthread_mutex_t mutex_sum;

void *calcDOT(void *);

int main(int argc, char *argv[]) {
    double xvals[VECTOR_SIZE / NUM_THREADS];
    double yvals[VECTOR_SIZE / NUM_THREADS];
    int index = 0;
    for (int count = NUM_THREADS; count < VECTOR_SIZE / NUM_THREADS; count = count * 2) {

        clock_t begin = clock();

        long vec_index;

        for (vec_index = 0; vec_index < VECTOR_SIZE; vec_index++) {
            dotstr.X[vec_index] = vec_index + 1;
            dotstr.Y[vec_index] = vec_index + 2;
        }

        dotstr.sum = 0;
        dotstr.compute_length = VECTOR_SIZE / count;

        pthread_t call_thread[count];
        pthread_attr_t attr;
        void *status;

        pthread_mutex_init(&mutex_sum, NULL);

        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

        long i;

        for (i = 0; i < count; i++) {
            pthread_create(&call_thread[i], &attr, calcDOT, (void *) i);
        }

        pthread_attr_destroy(&attr);

        for (i = 0; i < count; i++) {
            pthread_join(call_thread[i], &status);
        }

        printf("Resultant X*Y is %ld\n", dotstr.sum);
        pthread_mutex_destroy(&mutex_sum);
        clock_t end = clock();
        double time_spent = (double) (end - begin) / CLOCKS_PER_SEC;

        printf("time spent: %f NUM_THREADS: %d\n", time_spent, count);
        xvals[index] = count;
        yvals[index] = time_spent;
        index++;
    }

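    FILE *gnuplotPipe = popen("gnuplot -persistent", "w");
    if (gnuplotPipe == NULL) {
        perror("popen");
        return EXIT_FAILURE;
    }

    fprintf(gnuplotPipe, "plot '-' \n");

    /* Send only the points that were actually recorded. */
    for (int i = 0; i < index; i++) {
        fprintf(gnuplotPipe, "%f %f\n", xvals[i], yvals[i]);
    }

    /* "e" on its own line ends the inline data block. */
    fprintf(gnuplotPipe, "e\n");
    pclose(gnuplotPipe);

    pthread_exit(NULL);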
}

void *calcDOT(void *thread_id) {
    long vec_index;
    long start_index;
    long end_index;
    long length;
    long offset;
    long sum = 0;

    offset = (long) thread_id;
    length = dotstr.compute_length;

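    start_index = offset * length;
    /* Note: with the "- 1" here and "<" in the loop below, the last element
       of each chunk is skipped, so the printed sum changes with the thread count. */
    end_index = (start_index + length) - 1;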

    for (vec_index = start_index; vec_index < end_index; vec_index++) {
        sum += (dotstr.X[vec_index] * dotstr.Y[vec_index]);
    }

    pthread_mutex_lock(&mutex_sum);
    dotstr.sum += sum;
    pthread_mutex_unlock(&mutex_sum);

    pthread_exit((void *) thread_id);

}

Output

Resultant X*Y is 20900
time spent: 0.000155 NUM_THREADS: 2
Resultant X*Y is 19860
time spent: 0.000406 NUM_THREADS: 4
Resultant X*Y is 17680
time spent: 0.000112 NUM_THREADS: 8
Resultant X*Y is 5712
time spent: 0.000587 NUM_THREADS: 16
Niklas Rosencrantz