
I am looking for a very simple example to demonstrate a deadlock using pthread_join; however, this is not trivial.

I started with this:

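#include <stdio.h>
#include <pthread.h>

void* joinit(void* tid)
{
  pthread_t* tid_c = (pthread_t*)tid;
  int retval = pthread_join(*tid_c, NULL);
  /* cast assumes pthread_t is an integer type, as on Linux */
  printf("In joinit: tid = %lu, retval = %d \n", (unsigned long)*tid_c, retval);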
  return NULL;
}

int main()
{
  pthread_t thread1;
  pthread_t thread2;

  pthread_create(&thread1, NULL, joinit, &thread2);
  pthread_create(&thread2, NULL, joinit, &thread1);

  pthread_join(thread2, NULL);  

  return 0;
}

However, pthread_join reports 'EINVAL' (invalid argument), because thread2 has not been set yet when pthread_create for thread1 is called.

Any ideas?

Jonathan Leffler
今天春天
  • I tried this and it worked, but the problem is that it only works SOMETIMES, because sometimes I get EINVAL again. I AM checking errno in the method joinit! – 今天春天 Mar 19 '16 at 20:06
  • Yes you don't get to decide when a thread is started, the OS does that. – Iharob Al Asimi Mar 19 '16 at 20:06
  • Yes, I know. I need to have a reliable deadlock situation. – 今天春天 Mar 19 '16 at 20:07
  • Also, you `pthread_join(thread2, NULL)` in `main()` in the main thread, the other `pthread_join()` can cause a race condition. – Iharob Al Asimi Mar 19 '16 at 20:08
  • If I didn't do this, the main thread will return, exit will be called and all threads will die. – 今天春天 Mar 19 '16 at 20:09
  • You'll need to make sure that the threads don't access their parameters until both threads have been created; you need some additional synchronization between the threads. A semaphore would be one option. You could perhaps do it with a pair of mutexes passed to the threads (along with the other thread's `pthread_t *`, so you'd end up with a structure being passed). In this case, the main code would create the structures and the mutexes and lock the mutexes. After both threads are created, the main thread would release the mutexes. Should you join the other thread in `main` too? – Jonathan Leffler Mar 19 '16 at 20:12
  • Okay, now I found a way...I created a third thread which will run a while(true) and the main thread will join this. Then the other two threads will have enough time to join each other and a deadlock is created... – 今天春天 Mar 19 '16 at 20:12
  • Thank you @JonathanLeffler! I will have a look at this. – 今天春天 Mar 19 '16 at 20:14

2 Answers


If you're just wanting to demonstrate that a pthread_join can cause a deadlock, you could do something similar to the following code:

#include <stdio.h>
#include <pthread.h>

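void* joinit(void* tid)
{
    /* casts assume pthread_t is an integer type, as on Linux */
    pthread_t target = *(pthread_t*)tid;
    printf("In %#lx, waiting on %#lx\n", (unsigned long)pthread_self(), (unsigned long)target);
    pthread_join(target, NULL);
    printf("Leaving %#lx\n", (unsigned long)pthread_self());
    return NULL;
}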

int main(void)
{
    pthread_t thread1 = pthread_self();
    pthread_t thread2;
    pthread_create(&thread2, NULL, joinit, &thread1);
    joinit(&thread2);
    return 0;
}

This makes the main thread wait on the spawned thread and the spawned thread wait on the main thread, without extra locking primitives cluttering up what you are trying to demonstrate. On many systems this deadlocks; POSIX also allows pthread_join to detect the cycle and return EDEADLK instead, so check its return value.

And to answer some of your questions more directly:

it says 'EINVAL' (invalid argument) because thread2 is not yet specified when pthread_create for thread1 is called.

... and from one of your comments ...

I tried this and it worked, but the problem is, it only works SOMETIMES because sometimes I get EINVAL again.

In your code, you call pthread_create consecutively to spawn the 2 threads:

pthread_create(&thread1, NULL, joinit, &thread2);
pthread_create(&thread2, NULL, joinit, &thread1);

In your joinit code, you grab the thread handle passed in to join on:

pthread_t* tid_c = (pthread_t*)tid;
int retval = pthread_join(*tid_c, NULL);

Whether this works or fails with EINVAL depends on how the threads are scheduled. When the first pthread_create returns, thread1 holds a valid handle, but thread2 does not hold one until the second pthread_create has returned.

In addition, a newly created thread may take some time to start running its thread function, even though the handle returned by pthread_create is already valid. The outcome therefore depends on which thread gets to run first. If both pthread_create calls complete within the main thread's time slice, each spawned thread reaches pthread_join only after tid_c points to a valid handle. In the EINVAL case, pthread_create(&thread1, NULL, joinit, &thread2) was called and the spawned thread reached pthread_join(*tid_c, NULL) before pthread_create(&thread2, NULL, joinit, &thread1) had given thread2 a valid handle, which caused the error.

If you wanted to keep your code similar to how it is now, you would need to add a lock of some sort to ensure the threads don't exit or call anything prematurely:

#include <stdio.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void* joinit(void* tid)
{
    /* this can be above the lock because it will be valid after lock is acquired */
    pthread_t* tid_c = (pthread_t*)tid;
    int retval = -1;
    pthread_mutex_lock(&lock);
    pthread_mutex_unlock(&lock);
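    /* casts assume pthread_t is an integer type, as on Linux */
    printf("%#lx waiting on %#lx\n", (unsigned long)pthread_self(), (unsigned long)*tid_c);
    retval = pthread_join(*tid_c, NULL);
    printf("In joinit: tid = %#lx, retval = %d \n", (unsigned long)*tid_c, retval);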
    return NULL;
}

int main()
{
    pthread_t thread1;
    pthread_t thread2;
    /* get the lock in the main thread FIRST */
    pthread_mutex_lock(&lock);
    pthread_create(&thread1, NULL, joinit, &thread2);
    pthread_create(&thread2, NULL, joinit, &thread1);
    /* by this point, both handles are "joinable", so unlock  */
    pthread_mutex_unlock(&lock);

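    /* can wait on either thread, but must wait on one so main thread doesn't exit;
       note that thread1 also joins thread2, and POSIX leaves simultaneous joins
       on the same thread undefined */
    pthread_join(thread2, NULL);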
    return 0;
}

Hope this can help.

txtechhelp
  • It's worth mentioning that running just your code will not make the OP aware that a deadlock has happened. You should check the return value of pthread_join for that. In this case pthread_join will return 35 (EDEADLK) and one of the threads will simply return. `int ret=pthread_join((*((pthread_t*)tid)), NULL); printf("%d %s\n",ret,strerror(ret));` output: In 0x7fb4740, waiting on 0x77ca700 In 0x77ca700, waiting on 0x7fb4740 35 Resource deadlock avoided Leaving 0x77ca700 0 Success Leaving 0x7fb4740 In essence, pthread_join handled the deadlock for us here and skipped the waiting part. – kalimba Jul 14 '17 at 06:50
  • @kalimba .. actually running just my code causes a deadlock on some systems (like Windows and OpenBSD); others it _might_ return a value that _might_ indicate that a deadlock condition occurred (e.g. a kernel setting). On my Slackware system `EDEADLK` is defined as `22` (not `35`). POSIX is a loose standard, so while checking the return value of `pthread_join` is important, it's well more important to understand what a dead lock is and to avoid it in the first place (as the OP was asking to demonstrate one and presumably would avoid such a case in the first place). – txtechhelp Jul 14 '17 at 17:36
  • appreciate your response :) My information was limited to posix only. – kalimba Jul 16 '17 at 08:55

The main reason for your error is that you have two threads each waiting for the same thread to terminate because of the call to pthread_join in main. Another problem is that you don't ensure each thread correctly sees the other thread's ID.

Fix it like this:

#include <stdio.h>
#include <pthread.h>

pthread_t thread1;
pthread_t thread2;
pthread_mutex_t mutex;
pthread_cond_t cond;
int go = 0;


void* joinit(void* ptr)
{
  // wait until both thread IDs are known
  pthread_mutex_lock(&mutex);
  while (go == 0)
      pthread_cond_wait(&cond, &mutex);
  pthread_mutex_unlock(&mutex);

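  // ptr is the address of the other thread's handle
  pthread_t* tid_c = (pthread_t*) ptr;
  printf("About to wait\n");
  int retval = pthread_join(*tid_c, NULL);
  // cast assumes pthread_t is an integer type, as on Linux
  printf("In joinit: tid = %lu, retval = %d \n", (unsigned long)*tid_c, retval);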

  // tell the other threads we're done
  pthread_mutex_lock(&mutex);
  go++;
  pthread_cond_broadcast(&cond);
  pthread_mutex_unlock(&mutex);

  return NULL;
}

int main()
{
  // setup synchronization
  pthread_mutex_init(&mutex, NULL);
  pthread_cond_init(&cond, NULL);

  pthread_create(&thread1, NULL, joinit, &thread2);
  pthread_create(&thread2, NULL, joinit, &thread1);

  // tell the threads to go
  pthread_mutex_lock(&mutex);
  go = 1;
  pthread_cond_broadcast(&cond);
  pthread_mutex_unlock(&mutex);

  // wait for both threads to finish
  pthread_mutex_lock(&mutex);
  while (go != 3)
      pthread_cond_wait(&cond, &mutex);
  pthread_mutex_unlock(&mutex);

  return 0;
}
David Schwartz