
I have a simple question for you. I made this code to calculate the factorial of a number without recursion.

int fact2(int n){
    int aux=1, total = 1;
    int i;
    int limit = n - 1;
    for (i=1; i<=limit; i+=2){
        aux = i*(i+1);
        total = total*aux;
    }
    for (;i<=n;i++){
        total = total*i;
    }
    return total;
}

As you can see, my code uses loop unrolling to reduce the clock cycles spent in the loop. Now I'm asked to add two-way parallelism to the same code. Any idea how?

franpen
  • n! = n * (n-1) * ... * (n/2) * ... * 1. You can have one CPU do the first n/2 multiplications, the other CPU do the rest, then multiply the 2 results together. – Mark Plotnick Nov 21 '13 at 01:41

1 Answer


You can use the pthreads library to create two separate threads, where each thread does half of the multiplications. I put together the following solution.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int id;        /* 0 or 1: which half of the range this thread handles */
    int num;       /* the n whose factorial is being computed */
    int *result;   /* shared array; each thread writes only result[id] */
} thread_arg_t;

void* thread_func(void *arg) {
    thread_arg_t *th_arg = (thread_arg_t *)arg;
    int start, end, i;

    if (th_arg->id == 0) {            /* first half: 1 .. n/2 - 1 */
        start = 1;
        end = th_arg->num / 2;
    } else if (th_arg->id == 1) {     /* second half: n/2 .. n */
        start = th_arg->num / 2;
        end = th_arg->num + 1;
    } else {
        return NULL;
    }
    if (start < 1)                    /* avoid multiplying by 0 when n < 2 */
        start = 1;

    for (i = start; i < end; i++) {
        th_arg->result[th_arg->id] *= i;
    }
    return NULL;
}

int factorial2(int n) {
    pthread_t threads[2];
    thread_arg_t th_arg[2];
    int result[2] = { 1, 1 };         /* partial products must start at 1 */
    int i, rc;

    for (i = 0; i < 2; i++) {
        th_arg[i].id = i;
        th_arg[i].num = n;
        th_arg[i].result = result;
        rc = pthread_create(&threads[i], NULL, thread_func, (void *)&th_arg[i]);
        if (rc) {
            printf("pthread_create() failed, rc = %d\n", rc);
            exit(1);
        }
    }

    /* wait for both threads to finish */
    for (i = 0; i < 2; i++) {
        pthread_join(threads[i], NULL);
    }

    /* combine the two partial products */
    return result[0] * result[1];
}

The pthread library implementation should take care of running the two threads in parallel for you. This example can also be generalized to N threads with minor modifications; a rough sketch of that generalization follows.
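
For reference, here is one way that generalization might look. It is only a sketch under the same assumptions as above (the result still overflows `int` very quickly); the function name `factorial_n`, the helper `range_func`, and the fixed `NUM_THREADS` are illustrative choices, not anything required by pthreads. The range 1..n is split into `NUM_THREADS` contiguous chunks, each thread multiplies its own chunk into its own struct field, and the partial products are combined after the joins.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 4   /* illustrative; any N >= 1 works */

typedef struct {
    int start, end;     /* this thread multiplies start .. end-1 */
    int partial;        /* its partial product */
} range_arg_t;

static void *range_func(void *arg) {
    range_arg_t *r = (range_arg_t *)arg;
    int i;
    r->partial = 1;
    for (i = r->start; i < r->end; i++)
        r->partial *= i;
    return NULL;
}

int factorial_n(int n) {
    pthread_t threads[NUM_THREADS];
    range_arg_t args[NUM_THREADS];
    int i, total = 1;

    for (i = 0; i < NUM_THREADS; i++) {
        /* split 1..n into NUM_THREADS contiguous chunks */
        args[i].start = 1 + (n * i) / NUM_THREADS;
        args[i].end   = 1 + (n * (i + 1)) / NUM_THREADS;
        if (pthread_create(&threads[i], NULL, range_func, &args[i]) != 0) {
            perror("pthread_create");
            exit(1);
        }
    }
    for (i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
        total *= args[i].partial;   /* combine the partial products */
    }
    return total;
}

With gcc you would compile and link either version with the `-pthread` flag, e.g. `gcc -pthread factorial.c`.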

Punit Soni
  • How much slower is that than the plain old sequential solution? That's not your fault, but the number of multiplications that you can do on a factorial and not overflow a plain `int` is about 12 (including multiply by 1), and the overhead of creating and destroying the threads is astronomical by comparison. The range goes up to 20 for 64-bit `int` type; still not enough to offset the cost of thread creation. – Jonathan Leffler Nov 21 '13 at 02:19
  • You are right: for integer multiplications, there may not be any improvement from parallelizing. How about the floating-point case? – Punit Soni Nov 22 '13 at 19:20
  • See [Amdahl's Law](http://en.wikipedia.org/wiki/Amdahl's_law) for some explanation. The overhead of thread setup (and teardown) is not negligible, so the amount of computation that the thread has to do to be worth creating is also considerable. If you were using an exact 'big number' arithmetic package and calculating N! for values of N in the hundreds, then you would gain some benefit. For floating point arithmetic, you might see some benefit if you had large enough values of N, but my suspicion is that you wouldn't see any benefit; the computation simply isn't heavy enough. – Jonathan Leffler Nov 22 '13 at 21:41
  • Got it. For the particular problem of calculating a factorial, it's basically not worth parallelizing. Still not a bad exercise for learning parallel computing, though. – Punit Soni Nov 25 '13 at 22:57
  • Yes; as a tutorial exercise, it is fine. For production work, it would be silly. Just remember the difference when you need to apply the ideas to 'real life' situations. And note, as I said up front, the issue I'm raising is inherent in the question asked by the teacher; your answer addresses the problem perfectly adequately given the constraints of what is required. – Jonathan Leffler Nov 25 '13 at 23:37
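
To make the overflow point from the comments above concrete, here is a small stand-alone check (hypothetical, not part of the answer) that finds the largest n whose factorial still fits in a signed 32-bit and a signed 64-bit integer; it prints 12 and 20, matching the limits quoted above.

#include <stdio.h>
#include <stdint.h>

/* largest n such that n! <= max, computed without overflowing */
static int max_factorial_arg(uint64_t max) {
    uint64_t fact = 1;
    int n = 1;
    while (fact <= max / (uint64_t)(n + 1)) {  /* would (n+1)! still fit? */
        n++;
        fact *= (uint64_t)n;
    }
    return n;
}

int main(void) {
    printf("largest n with n! in int32_t: %d\n", max_factorial_arg(INT32_MAX));
    printf("largest n with n! in int64_t: %d\n", max_factorial_arg(INT64_MAX));
    return 0;
}

The loop tests whether (n+1)! would still fit before multiplying, so the running product itself never overflows.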