0

This is some sort of a Java Puzzler, that I stumbled across and can't really explain. Maybe somebody can?

The following program hangs after a short time. Sometimes after 2 outputs, sometimes after 80, but almost always before terminating correctly. You might have to run it a few times, if it doesn't happen the first time.

public class Main {
    public static void main(String[] args) {
        final WorkerThread[] threads = new WorkerThread[]{ new WorkerThread("Ping!"), new WorkerThread("Pong!") };
        threads[0].start();
        threads[1].start();

        Runnable work = new Runnable() {
            private int counter = 0;
            public void run() {
                System.out.println(counter + " : " + Thread.currentThread().getName());
                threads[counter++ % 2].setWork(this);
                if (counter == 100) {
                    System.exit(1);
                }
            }
        };

        work.run();
    }
}

class WorkerThread extends Thread {
    private Runnable workToDo;

    public WorkerThread(String name) {
        super(name);
    }

    @Override
    public void run() {
        while (true){
            if (workToDo != null) {
                workToDo.run();
                workToDo = null;
            }
        }
    }

    public void setWork(Runnable newWork) {
        this.workToDo = newWork;
    }
}

Now, it's clear that busy waiting loops are not a great idea in general. But this not about improving, it's about understanding what is happening.

Since everything works as expected when WorkerThread.setWork() is synchronized or when the WorkerThread.workToDo field is set to volatile I suspect a memory issue.

But why exactly is it happening? Debugging doesn't help, once you start stepping through, everything behaves as expected.

An explanation would be appreciated.

cheppsn
  • 190
  • 13

3 Answers3

2
  1. The first problem is that you are setting the Runnable workToDo from the main thread and then reading it in the 2 forked threads without synchronization. Any time you modify a field in multiple threads, it should be marked as volatile or someone synchronized.

    private volatile Runnable workToDo;
    
  2. Also, because multiple threads are doing counter++ this also needs to be synchronized. I recommend an AtomicInteger for that.

    private AtomicInteger counter = new AtomicInteger(0);
    ...
    threads[counter.incrementAndGet() % 2].setWork(this);
    
  3. But I think the real problem may be one of race conditions. It is possible for both threads to set the workToDo to be the Runnable and then have them both return and set it back to be null so they will just spin forever. I'm not sure how to fix that.

    1. threads[0] has it's `workToDo` set to the runnable.  It calls `run()`.
    2. at the same time threads[1] also calls `run()`.
    3. threads[0] sets the `workToDo` on itself and threads[1] to be the runnable.
    4. at the same time threads[1] does the same thing.
    5. threads[0] returns from the `run()` method and sets `workToDo` to be `null`.
    6. threads[1] returns from the `run()` method and sets `workToDo` to be `null`.
    7. They spin forever...
    

And, as you mention, the spin loop is crazy but I assume this is a demonstration thread program.

Gray
  • 115,027
  • 24
  • 293
  • 354
  • 2
    Use of a Object monitor (wait/notify) would also be advisable – MadProgrammer Apr 23 '13 at 21:36
  • Yeah @MadProgrammer. I assume this is some sort of thread demo program or something. – Gray Apr 23 '13 at 21:38
  • The "should" part is clear, as explained above this fixes the problem. But why? Somehow the field contents seem to get lost and both threads are left with a view to a `null` field (hence busy-waiting forever). However, when you set a breakpoint at that time in a WorkerThread's `run()` method, the current thread halts and suddenly sees the `workToDo` again. I'm looking for a more in-depth explanation. – cheppsn Apr 23 '13 at 21:43
  • @MadProgrammer: That's understood. But it's not the issue here. This is not about fixing the program, it's about getting an explanation of the observed behavior. – cheppsn Apr 23 '13 at 21:45
  • I've edited my answer to show how both threads can end up with null `workToDo` fields and spin forever @user2119598. – Gray Apr 23 '13 at 21:46
  • I stand corrected abut my comment above: It's rubbish, of course setting a breakpoint does not magically get things into order again once both Threads have their `workToDo` set to `null`. I think __torquestomp__ is absolutely right with his answer, it explains clearly what is happening. – cheppsn Apr 23 '13 at 21:54
  • I had that point before he did but whatever. Glad it's working for you @user2119598. You _will_ need to make sure of the synchronization points I mention in my answer as well. – Gray Apr 23 '13 at 21:57
  • @Gray: I marked the _best_ answer. **torquestomp**'s answer was just better understandable than yours in my opinion, sorry. – cheppsn Apr 23 '13 at 22:06
1

The problem occurs right between these lines:

workToDo.run();
workToDo = null;

Suppose the following sequence of events occurs:

- Original Runnable runs.  "Ping!".setWork() called
- Ping! thread realizes workToDo != null, calls run(), the stops between those two lines
  - "Pong!".setWork() called
- Pong! thread realizes workToDo != null, calls run()
  - "Ping!".setWork() called
- Ping! thread resumes, sets workToDo = null, ignorantly discarding the new value
- Both threads now have workToDo = null, and the counter is frozen at 2,...,80
  Program hangs
torquestomp
  • 3,304
  • 19
  • 26
  • I think there is also be a cache-coherency issue here too. Making `workToDo` a `volatile` or making `setWork` to `synchronized` will fix the cache-coherency by forcing the value of `workToDo` to be visible to all threads. However, it won't solve the race condition you have identified, that would require a lock or other construct to prevent another thread from calling `setWork()` while this thread is between `workToDo.run()` and `workToDo = null`. – Mike Tunnicliffe Apr 23 '13 at 22:01
0

my 2 cents....

import java.util.concurrent.atomic.AtomicReference;

class WorkerThread extends Thread {
    private AtomicReference<Runnable> work;

    public WorkerThread(String name) {
        super(name);
        work = new AtomicReference<Runnable>();
    }

    @Override
    public void run() {
        while (true){
            Runnable workToDo = work.getAndSet(null);
            if ( workToDo != null ) {
                workToDo.run();
            }
        }
    }

    public void setWork(Runnable newWork) {
        this.work.set(newWork);
    }
}
Kevin
  • 431
  • 4
  • 11
  • This solves the race condition, there is still potential for setWork to be called before the work item has been serviced. work should probably be a queue, or setWork should block until work is cleared before allowing it to be set. – Kevin Aug 14 '14 at 18:40