printf performance issue in OpenMP

Question

I have been told not to use printf in OpenMP programs because it degrades the performance of a parallel simulation.

I want to know what the substitute for it is, i.e. how to display the output of a program without using printf.

I have the following AES-128 simulation problem using OpenMP, which needs further comments: Parallel simulation of AES in C using Openmp

How can I output the ciphertext without degrading the simulation performance?

Thanks in advance.

Hristo Iliev · Accepted Answer · 2013-11-20T17:02:11.360

You cannot both have your pie and eat it. Decide if you want to have great parallel performance or if it's important to see the output of the algorithm while running the parallel loop.

The obvious offline solution is to store the plaintexts, keys and ciphertexts in arrays. In your case that would require 119 MiB (= 650000*(3*4*16) bytes) in the original case and only 12 MiB in the case with 65000 trials. That is nothing a modern machine with GiBs of RAM cannot handle. The latter case even fits in the last-level cache of some server-class CPUs.

#define TRIALS 65000

int (*key)[16];
int (*pt)[16];
int (*ct)[16];

double timer;

key = malloc(TRIALS * sizeof(*key));
pt = malloc(TRIALS * sizeof(*pt));
ct = malloc(TRIALS * sizeof(*ct));

timer = -omp_get_wtime();

#pragma omp parallel for private(rnd,j)
for(i = 0; i < TRIALS; i++)
{
   ...

   for(j = 0; j < 4; j++)
   {
      key[i][4*j]   = (rnd[j] & 0xff);
      pt[i][4*j]    = key[i][4*j];
      key[i][4*j+1] = ((rnd[j] >> 8)  & 0xff) ;
      pt[i][4*j+1]  = key[i][4*j+1];
      key[i][4*j+2] = ((rnd[j] >> 16) & 0xff) ;
      pt[i][4*j+2]  = key[i][4*j+2];
      key[i][4*j+3] = ((rnd[j] >> 24) & 0xff) ;
      pt[i][4*j+3]  = key[i][4*j+3];
   }

   encrypt(key[i],pt[i],ct[i]);
}

timer += omp_get_wtime();
printf("Encryption took %.6f seconds\n", timer);

// Now display the results serially
for (i = 0; i < TRIALS; i++)
{
    // display pt[i], key[i] -> ct[i] here, e.g. with printf
}

free(key); free(pt); free(ct);

To see the speed-up, you have to measure only the time spent in the parallel region. If you also measure the time it takes to display the results, you will be back to where you started.
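The same compute-first, print-later pattern can be sketched as a small set of self-contained helpers. This is only an illustration under stated assumptions: `toy_encrypt` is a hypothetical stand-in (it is not AES and not the `encrypt` from the question), the key and plaintext values are made-up deterministic fillers rather than the question's random numbers, and `now`, `run_trials` and `format_block` are names invented for this sketch. The timer falls back to `clock()` when the code is built without OpenMP.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define BLOCK 16

/* Hypothetical stand-in for the question's encrypt(); this is NOT AES. */
static void toy_encrypt(const int *key, const int *pt, int *ct)
{
    for (int j = 0; j < BLOCK; j++)
        ct[j] = (pt[j] + key[j]) & 0xff;
}

/* Wall-clock seconds with OpenMP, CPU seconds via clock() without it. */
static double now(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Fill key/pt/ct for all trials in parallel, with no I/O inside the loop.
   Returns the time spent in the compute phase only. */
static double run_trials(int trials, int (*key)[BLOCK],
                         int (*pt)[BLOCK], int (*ct)[BLOCK])
{
    assert(trials >= 0);

    double timer = -now();

    #pragma omp parallel for
    for (int i = 0; i < trials; i++) {
        for (int j = 0; j < BLOCK; j++) {
            /* Made-up filler values; the original derives these from rnd[] */
            key[i][j] = (i * 31 + j * 7) & 0xff;
            pt[i][j]  = (i + j) & 0xff;
        }
        /* Each iteration writes only its own row i, so no locking is needed */
        toy_encrypt(key[i], pt[i], ct[i]);
    }

    timer += now();
    return timer;
}

/* Format one 16-value block as 32 hex digits; out must hold 33 bytes. */
static void format_block(const int *blk, char *out)
{
    for (int j = 0; j < BLOCK; j++)
        snprintf(out + 2 * j, 3, "%02x", (unsigned)blk[j] & 0xffu);
}
```

A caller would allocate the three arrays as in the answer above, call `run_trials` once, report the returned time, and only then loop serially over the trials, passing each row through `format_block` and printing it. All output then happens outside the timed parallel region.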

Hristo, I have no words to thank you. I don't have enough points to give you any points but my heartfelt thanks to you. Thanks very much to you and the entire stackoverflow community for their super contributions — user2979872, Nov 20 '13 at 14:54
