
Hi guys, so I've got this piece of code:

public class Padding {

  static class Pair {

        volatile long c1;
        // Uncomment this line and see how performance is boosted ~2x
      //  long q1; // Magic dodo thingy

        volatile long c2;

  }

  static Pair p = new Pair();

  static class Worker implements Runnable {

        private static final int INT = Integer.MAX_VALUE/8;
        private boolean b;
        private double res; // accumulator so the loop body isn't optimized away
        Worker(boolean b) {
              this.b = b;
        }

        public void run() {
              long start = System.currentTimeMillis();
              if (b) {
                    for (int i = 0; i < INT; i++) {
                          p.c1++;
                          res += Math.random();
                    }
              } else {
                    for (int i = 0; i < INT; i++) {
                          p.c2++;
                          res += Math.random();
                    }
              }
              long end = System.currentTimeMillis();
              System.out.println("took: " + (end - start) + " Result: " + (p.c1 + p.c2));
        }

  }


  public static void main(String[] args) {
        System.out.println("Starting....");
        Thread t1 = new Thread(new Worker(true));
        Thread t2 = new Thread(new Worker(false));

        t1.start();
        t2.start();


  }

}

So if I run it, it takes about 11 seconds, but if I uncomment q1 it runs in 3 seconds. I tried to find something on the internet, but nothing informative enough came up. As I understand it, it has something to do with JVM optimization, and long q1 probably makes the memory (or cache) layout somehow better. Anyway, my question is: does someone know where I can read more about this? Thanks

urag

1 Answer

Performance in your example is degraded by false sharing: c1 and c2 are placed on the same cache line, and the threads need to flush/load values to/from main memory at every increment of the different fields to maintain cache coherency, because each write invalidates the other thread's copy of the cache line.

In your case it is enough to declare one more long field q1 right after c1 to push c2 onto another cache line (whose size is only 64 bytes for the x86 family). After that, cache management becomes far more efficient: the threads use different cache lines and no longer invalidate each other's copies.

There are many articles devoted to the hardware nature of this issue (and software ways of avoiding it). Dealing with false sharing via a "footprint padding" solution like yours has been tricky for a long time: the Java platform doesn't guarantee that the field order and cache-line padding at runtime will be exactly what you expect from the class declaration. Even a "minor" platform update or a switch to another VM implementation can break such a solution (because fields, especially unused dummies, are subject to optimization).
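One padding style that has historically survived field reordering better than unused fields in a single class is padding via a class hierarchy, since HotSpot lays out superclass fields before subclass fields. This is only a sketch, assuming a 64-byte cache line; the class and field names here are made up for illustration:

```java
public class PaddedCounters {

    // Superclass holds the first hot field.
    static class C1 {
        volatile long c1;
    }

    // Padding lives in an intermediate subclass. Because superclass
    // fields are laid out before subclass fields, p1..p6 (48 bytes)
    // keep c1 and c2 apart even if the JIT reorders fields within
    // a single class.
    static class Pad extends C1 {
        long p1, p2, p3, p4, p5, p6;
    }

    // Second hot field lands on a different cache line.
    static class C2 extends Pad {
        volatile long c2;
    }

    public static void main(String[] args) {
        C2 pair = new C2();
        pair.c1 = 1;
        pair.c2 = 2;
        System.out.println(pair.c1 + pair.c2);
    }
}
```

Note that even this variant is a layout assumption, not a guarantee; tools like JOL (Java Object Layout) can be used to verify the actual field offsets on your VM.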

That's why JEP 142 was introduced and the @Contended annotation was implemented in Java 8. This annotation lets you mark which fields of a class should be placed on separate cache lines. But even now it's just a VM hint, without any absolute guarantee of avoiding false sharing in all situations, so you should look at your code carefully and verify its behaviour (if your application is sensitive to the false sharing issue, of course).

Cootri