5

I am confused about sharing arrays safely between threads in Java, specifically memory fences and the keyword synchronized.

This Q&A is helpful, but does not answer all of my questions: Java arrays: synchronized + Atomic*, or synchronized suffices?

What follows is sample code to demonstrate the issue. Assume there is a pool of worker threads that populates the SharedTable via method add(...). After all worker threads are done, a final thread reads and saves the data.

Sample code to demonstrate the issue:

public final class SharedTable {

    // Column-oriented data entries
    private final String[] data1Arr;
    private final int[] data2Arr;
    private final long[] data3Arr;
    private final AtomicInteger nextIndex;

    public SharedTable(int size) {
        this.data1Arr = new String[size];
        this.data2Arr = new int[size];
        this.data3Arr = new long[size];
        this.nextIndex = new AtomicInteger(0);
    }

    // Thread-safe: Called by worker threads
    public void addEntry(String data1, int data2, long data3) {
        final int index = nextIndex.getAndIncrement();
        data1Arr[index] = data1;
        data2Arr[index] = data2;
        data3Arr[index] = data3;
    }

    // Not thread-safe: Called by clean-up/joiner/collator thread...
    // after worker threads are complete
    public void save() {
        // Does this induce a full memory fence to ensure thread-safe reading of 
        synchronized (this) {
            final int usedSide = nextIndex.get();
            for (int i = 0; i < usedSide; ++i) {
                final String data1 = data1Arr[i];
                final int    data2 = data2Arr[i];
                final long   data3 = data3Arr[i];
                // TODO: Save data here
            }
        }
    }
}

The sample code above could also be implemented using Atomic*Array, which acts as an "array of volatile values/references".

public final class SharedTable2 {

    // Column-oriented data entries
    private final AtomicReferenceArray<String> data1Arr;
    private final AtomicIntegerArray  data2Arr;
    private final AtomicLongArray data3Arr;
    private final AtomicInteger nextIndex;

    public SharedTable2(int size) { ... }

    // Thread-safe: Called by worker threads
    public void addEntry(String data1, int data2, long data3) {
        final int index = nextIndex.getAndIncrement();
        data1Arr.set(index, data1);
        ...
    }

    // Not thread-safe: Called by clean-up/joiner/collator thread...
    // after worker threads are complete
    public void save() {
        final int usedSide = nextIndex.get();
        for (int i = 0; i < usedSide; ++i) {
            final String data1 = data1Arr.get(i);
            final int    data2 = data2Arr.get(i);
            final long   data3 = data3Arr.get(i);
            // TODO: Save data here
        }
    }
}
  1. Is SharedTable thread-safe (and cache coherent)?
  2. Is SharedTable (much?) more efficient as only a single memory fence is required, whereas SharedTable2 invokes a memory fence for each call to Atomic*Array.set(...)?

If it helps, I am using Java 8 on 64-bit x86 hardware (Windows and Linux).

Community
  • 1
  • 1
kevinarpe
  • 20,319
  • 26
  • 127
  • 154
  • The comment above the method `public void addEntry(..)` in both `SharedTable` and `SharedTable2` says this method is thread-safe. I presume you might be synchronizing on the `SharedTable` or `SharedTable2` object in the worker threads while calling `public void addEntry(..)` method. Isn't it? – Madhusudana Reddy Sunnapu Dec 31 '15 at 07:41

2 Answers2

4

No, SharedTable is not thread-safe. A happens-before is only guaranteed if you read, from a synchronized block, something that has been written from a synchronized block using the same lock.

Since the writes are made out of a synchronized block, the JMM doesn't guarantee that the writes will be visible by the reader thread.

JB Nizet
  • 678,734
  • 91
  • 1,224
  • 1,255
  • Very informative! Follow-up question: If I can invoke a full memory fence (via `sun.misc.Unsafe`), instead of using a `synchronized` block, would `SharedTable` be thread-safe? – kevinarpe Dec 31 '15 at 07:59
  • 1
    I wouldn't use Unsafe. I would use an atomic array, which avoids using unsafe, not supposed to be use classes, and makes the intent clear in the code. That should be fast enough. If it isn't and if you've proven that the problem comes from that part, then start optimizing. – JB Nizet Dec 31 '15 at 08:26
  • I'm coming back to this answer because it is so helpful, yet so dense if you are not deeply familiar with the concept of *happens-before* in the Java Memory Model [JMM]. This link helps to understand more: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/package-summary.html#MemoryVisibility – kevinarpe Jan 10 '16 at 09:08
0

Mutating an object that is exchanged between threads, can be done outside of a synchronized block.

Let's first introduce a very practical example. Imagine you have 2 threads; one produces jobs and the other consumes jobs. These threads communicate with each other using a queue. Let's assume a BLockingQueue. Then the producer thread can use simple POJO objects that do not have any internal synchronization and safely exchange these POJOs with the consumer thread. This is exactly how java Executors work. In the documentation, you will find something about the memory consistency effects.

Why does it work?

There needs to be a happens-before edge between writing of the fields of the job and reading the fields of the job.

class Job{int a;}

queue = new SomeBlockingQueue();

thread1:
    job = new Job();
    job.a=1;                    (1)
    queue.put(job);             (2)

thread2:
    job=queue.take();           (3)
    r1=job.a;                   (4)

There is a happens-before edge between (1) and (2) due to program order rule.

There is a happens-before edge between (2) and (3) due to either the monitor lock rule or volatile variable rule.

There is a happens-before edge between (3) and (4) due to the program order rule.

Because the happens-before relation is transitive, there is a happens-before edge between (1) and (4) and hence there is no data race.

So the above code will work fine. But if the producer modifies the Job after it has put it on the queue, then there could be a data race. So you need to make sure your code doesn't suffer from that problem.

pveentjer
  • 10,545
  • 3
  • 23
  • 40