I am a bit confused about the best way to use malloc()/free() in an OpenMP parallel for loop. Here are two approaches I thought of, but I don't know which one is better. I learned from previous answers that calling malloc/free inside a loop can fragment memory.

Suppose I have a loop that runs a million times:

for (size_t i = 0 ; i< 1000000; ++i){
    double * p = malloc(sizeof(double)*FIXED_SIZE); 

    /* FIXED_SIZE is constant for the entire loop
    but is only known at run time */

    ....... /* Do some stuff using p array */

    free(p);
}

Now I want to parallelize the above loop with OpenMP.

Method 1: simply add a pragma on top of the for loop

#pragma omp parallel for
for (size_t i = 0 ; i< 1000000; ++i){

    #pragma omp atomic
    double * p = malloc(sizeof(double)*FIXED_SIZE); 
    
    ....... /* Do some stuff using p array */

    #pragma omp atomic
    free(p);
}

Method 2: allocate one shared array outside the loop, with a separate slice for each thread

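/* omp_get_num_threads() returns 1 outside a parallel region,
   so ask for the maximum team size instead */
int num_threads = omp_get_max_threads();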
double * p = malloc(sizeof(double)*FIXED_SIZE * num_threads); 

#pragma omp parallel for
for (size_t i = 0 ; i< 1000000; ++i){

    int thread_num = omp_get_thread_num();

    double * p1 = p + FIXED_SIZE*thread_num ;
    
    ....... /* Do some stuff using p1 array */
}
free(p);
PierU
Aditya Kurrodu
  • Regardless of parallelization, it seems to me like the allocation inside the `for` is completely unneeded. You should definitely do it outside *once* instead of inside one million times. Why even do it inside in the first place? – Marco Bonelli May 20 '23 at 16:11
  • Even if each thread needed its own allocation, you could still pull those out of the loop by creating a parallel region broader than just the loop. – John Bollinger May 20 '23 at 17:24
  • `malloc` uses operating system functions or something deep in the runtime environment. Thus it is very likely that you have race conditions. – Victor Eijkhout May 21 '23 at 15:18
  • @VictorEijkhout If using `malloc()` from multiple threads causes a problem, the environment is broken. It's probably not a good idea to do that as it will likely be **S-L-O-W**, but it shouldn't fail or break things. – Andrew Henle May 23 '23 at 18:31
  • @AndrewHenle Can you quote a source that `malloc` is threadsafe? You may be right but I need convincing. – Victor Eijkhout May 23 '23 at 22:31
  • @VictorEijkhout [**7.1.4 Use of library functions**, p5 is relevant](https://port70.net/~nsz/c/c11/n1570.html#7.1.4), if not an explicit guarantee. See also [footnote 189](https://port70.net/~nsz/c/c11/n1570.html#note189). POSIX requires it - so on Linux it will be. See [**Are functions in the C standard library thread safe?**](https://stackoverflow.com/questions/19974548/are-functions-in-the-c-standard-library-thread-safe). And [`FILE`-based streams are required by the C standard to be thread-safe](https://port70.net/~nsz/c/c11/n1570.html#7.21.2p7). – Andrew Henle May 23 '23 at 23:27

1 Answer

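First open a parallel region and allocate a buffer for each thread inside it, then use a work-sharing `omp for` to split the loop across those threads. This way each thread calls malloc() and free() only once instead of once per iteration.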

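#pragma omp parallel
{
  /* Each thread allocates its own private buffer once, before the loop. */
  double * p = malloc(sizeof(double)*FIXED_SIZE);

  /* Split the iterations among the threads of the enclosing region. */
  #pragma omp for
  for (size_t i = 0 ; i< 1000000; ++i) { ... }

  /* Each thread frees its own buffer after the loop. */
  free(p);
}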
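To make the pattern above concrete, here is a minimal self-contained sketch. The function name `run_with_per_thread_buffer`, its parameters, and the workload (fill the buffer with `i + k`, then sum it) are made up for illustration and are not from the question. It uses only pragmas and no `omp.h` calls, so it should also build and run serially when OpenMP is not enabled (with GCC or Clang, OpenMP is enabled with `-fopenmp`).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical workload: for every i, fill a scratch buffer with
   i + k (k = 0 .. fixed_size-1) and add the buffer's sum to a total.
   Returns the total, or -1.0 if any thread failed to allocate. */
double run_with_per_thread_buffer(size_t n_iter, size_t fixed_size)
{
    double total = 0.0;
    int alloc_failed = 0;

    #pragma omp parallel reduction(+:total) reduction(|:alloc_failed)
    {
        /* One allocation per thread, done once before the loop. */
        double *p = malloc(sizeof(double) * fixed_size);
        if (p == NULL)
            alloc_failed = 1;

        /* Every thread in the team has to reach the work-sharing loop,
           so an allocation failure is handled inside the loop body
           rather than by leaving the region early. */
        #pragma omp for
        for (size_t i = 0; i < n_iter; ++i) {
            if (p == NULL)
                continue;

            /* "Do some stuff using p array": fill it, then sum it. */
            for (size_t k = 0; k < fixed_size; ++k)
                p[k] = (double)i + (double)k;

            double s = 0.0;
            for (size_t k = 0; k < fixed_size; ++k)
                s += p[k];

            total += s;
        }

        /* Each thread frees its own buffer; free(NULL) is a no-op. */
        free(p);
    }

    return alloc_failed ? -1.0 : total;
}
```

Calling `run_with_per_thread_buffer(1000, 8)` should return 4024000.0 regardless of the number of threads, because every partial sum is a small integer that a double represents exactly. The `reduction` clauses are one possible way to combine the per-thread results; a shared flag updated with `#pragma omp atomic write` would work for the failure indicator as well.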
tstanisl