1

My code (see below) produces an odd behaviour. The output is:

Testing whether there are problems with concurrency ...rc is 0. i is 0
.rc is 0. i is 0
.rc is 3. i is 1
.rc is 0. i is 0
.rc is 3. i is 1
.rc is 3. i is 2
.rc is 0. i is 0
.rc is 3. i is 1
.rc is 3. i is 2
.rc is 3. i is 3
.rc is 0. i is 0
.rc is 3. i is 1
.rc is 3. i is 2
.rc is 3. i is 3
.rc is 3. i is 4
.rc is 0. i is 0
Segmentation fault (core dumped)

I tried to debug it, but only found out that i is reset to 0 right after pthread_join. This leads me to the conclusion that the modification must happen somewhere there. But i can't find a thing. I feel kind of stupid, since this isn't really a hard piece of code. What did i not notice?

Operating system is Ubuntu 14.04. N_THREADS is currently set to 10, N_RUNS is 10000.

Main thread:

pthread_t threads[N_THREADS];
pthread_attr_t attr;
int i;
int rc;
int status;

printf("Testing whether there are problems with concurrency ...");

pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

for (i = 0; i < N_THREADS; i++){
  if (i) {
    rc = pthread_create(&(threads[i]), &attr, addRemove, 0);
  } else {
    rc = pthread_create(&(threads[i]), &attr, readStuff, 0);
  }
  if (rc) return rc;
}

for(i = 0; i < N_THREADS; i++) {
  rc = pthread_join(threads[i], (void*) &status);
//  if(rc == 3) 
      printf("rc is %d. i is %d\n", rc, i);
//  if (rc) return rc;
  if (status) return status;
  printf(".");
}

pthread_attr_destroy(&attr);

return 0;

Worker threads:

void* readStuff(void* a)
{
  int i;
  for (i = 0; i< N_RUNS; i++){
  ;
  }
  pthread_exit((void*)0);
}

void* addRemove(void* a)
{
   int i;
   for (i = 0; i< N_RUNS; i++){
   ;
   }
   pthread_exit((void*)0);
}

There are no other threads except the main thread and the ones created in the code above.

Compileable example

Martze
  • 921
  • 13
  • 32
  • 1
    you have two for loop in the join part. Is this a typo? – KiaMorot Feb 11 '15 at 09:12
  • How about providing a complete, compilable example? http://sscce.org/ – davmac Feb 11 '15 at 09:15
  • I added one to the question. – Martze Feb 11 '15 at 09:19
  • Be more careful about your use of casts. `pthread_exit((void*)0)` is unnecessary, either `pthread_exit(NULL)` or `pthread_exit(0)` would be fine. The `pthread_join` call is just wrong, but is hidden by your cast. Declare a `void*` variable and pass its address to `pthread_join` (without a cast) to get the status back. – Jonathan Wakely Feb 11 '15 at 09:35

1 Answers1

2

I think your problem is with the pthread_join. From the man page:

       int pthread_join(pthread_t thread, void **retval);
       ...

       If  retval is not NULL, then pthread_join() copies the exit status of the tar‐
       get  thread  (i.e.,  the  value   that   the   target   thread   supplied   to
       pthread_exit(3))  into  the  location  pointed  to  by *retval.  If the target
       thread was canceled, then PTHREAD_CANCELED is placed in *retval.

Note that it takes a void **, which means it overwrites the thing pointed to by retval with a void * (size 8 on 64 bit). You are passing an int * (i.e. &status), which is a pointer to an object of size 4 on most platforms.

So, pthread_join will be overwriting memory. Instead, declare status as a void * as per the function prototype.

You are also testing status; I don't know what you are trying to achieve here.

In general, compiling with -Wall will show you these errors.

abligh
  • 24,573
  • 4
  • 47
  • 84
  • -Wall -Wextra didn't notify me about that. But it solved the problem! thanks! Am i right if i say this wouldn't have happened on a 32 bit system? – Martze Feb 11 '15 at 09:28
  • Yes (I missed the cast). And yes, `sizeof(void *) == sizeof(int)` on 32 bit. – abligh Feb 11 '15 at 09:31
  • 1
    It wouldn't have written 8 bytes to a 4-byte memory location on 32-bit, but your code would still have been wrong, as `pthread_join` would have written an address to `status` – Jonathan Wakely Feb 11 '15 at 09:32