24

This is on Java 7 (51) on RHEL with 24 cores We are noticing a rise in average response times of a java SimpleDateFormat wrapped in thread local as we increase the thread pool size. Is this expected? or, I am just doing something stupid ?

enter image description here

Test program

    public class DateFormatterLoadTest {
        private static final Logger LOG = Logger.getLogger(DateFormatterLoadTest .class);
        private final static int CONCURRENCY = 10;

        public static void main(String[] args) throws Exception {
            final AtomicLong total = new AtomicLong(0);
            ExecutorService es = Executors.newFixedThreadPool(CONCURRENCY);
            final CountDownLatch cdl = new CountDownLatch(CONCURRENCY);
            for (int i = 0; i < CONCURRENCY; i++) {
                es.execute(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            int size = 65000;
                            Date d = new Date();

                            long time = System.currentTimeMillis();
                            for (int i = 0; i < size; i++) {
                                String sd = ISODateFormatter.convertDateToString(d);
                                assert (sd != null);
                            }
                            total.addAndGet((System.currentTimeMillis() - time));

                        } catch (Throwable t) {
                            t.printStackTrace();
                        } finally {
                            cdl.countDown();
                        }
                    }
                });
            }
            cdl.await();
            es.shutdown();
            LOG.info("TOTAL TIME:" + total.get());
            LOG.info("AVERAGE TIME:" + (total.get() / CONCURRENCY));
        }
    }

DateFormatter class:

public class ISODateFormatter {
    private static final Logger LOG = Logger.getLogger(ISODateFormatter.class);

    private static ThreadLocal<DateFormat> dfWithTZ = new ThreadLocal<DateFormat>() {
        @Override
        public DateFormat get() {
            return super.get();
        }

        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ",
                    Locale.ENGLISH);
        }

        @Override
        public void remove() {
            super.remove();
        }

        @Override
        public void set(DateFormat value) {
            super.set(value);
        }

    };

    public static String convertDateToString(Date date) {
        if (date == null) {
            return null;
        }
        try {
            return dfWithTZ.get().format(date);
        } catch (Exception e) {
            LOG.error("!!! Error parsing dateString: " + date, e);
            return null;
        }
    }
}

Someone suggested to take out the AtomicLong so just wanted to share that it is not playing any role in increasing the average time:

##NOT USING ATOMIC LONG##
2014-02-28 11:03:52,790 [pool-1-thread-1] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:328
2014-02-28 11:03:52,868 [pool-1-thread-6] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:406
2014-02-28 11:03:52,821 [pool-1-thread-2] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:359
2014-02-28 11:03:52,821 [pool-1-thread-8] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:359
2014-02-28 11:03:52,868 [pool-1-thread-4] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:406
2014-02-28 11:03:52,915 [pool-1-thread-5] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:453
2014-02-28 11:03:52,930 [pool-1-thread-7] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:468
2014-02-28 11:03:52,930 [pool-1-thread-3] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:468
2014-02-28 11:03:52,930 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - CONCURRENCY:8

##USING ATOMIC LONG##
2014-02-28 11:02:53,852 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - TOTAL TIME:2726
2014-02-28 11:02:53,852 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - CONCURRENCY:8
2014-02-28 11:02:53,852 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - AVERAGE TIME:340

##NOT USING ATOMIC LONG##
2014-02-28 11:06:57,980 [pool-1-thread-3] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:312
2014-02-28 11:06:58,339 [pool-1-thread-8] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:671
2014-02-28 11:06:58,339 [pool-1-thread-4] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:671
2014-02-28 11:06:58,307 [pool-1-thread-7] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:639
2014-02-28 11:06:58,261 [pool-1-thread-6] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:593
2014-02-28 11:06:58,105 [pool-1-thread-15] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:437
2014-02-28 11:06:58,089 [pool-1-thread-13] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:421
2014-02-28 11:06:58,073 [pool-1-thread-1] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:405
2014-02-28 11:06:58,073 [pool-1-thread-12] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:405
2014-02-28 11:06:58,042 [pool-1-thread-14] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:374
2014-02-28 11:06:57,995 [pool-1-thread-2] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:327
2014-02-28 11:06:57,995 [pool-1-thread-16] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:327
2014-02-28 11:06:58,385 [pool-1-thread-10] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:717
2014-02-28 11:06:58,385 [pool-1-thread-11] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:717
2014-02-28 11:06:58,417 [pool-1-thread-9] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:749
2014-02-28 11:06:58,418 [pool-1-thread-5] INFO  net.ahm.graph.DateFormatterLoadTest  - THREAD TIME:750
2014-02-28 11:06:58,418 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - CONCURRENCY:16

##USING ATOMIC LONG##
2014-02-28 11:07:57,510 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - TOTAL TIME:9365
2014-02-28 11:07:57,510 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - CONCURRENCY:16
2014-02-28 11:07:57,510 [main] INFO  net.ahm.graph.DateFormatterLoadTest  - AVERAGE TIME:585
Debajyoti Roy
  • 985
  • 2
  • 12
  • 34
  • Instance of the local thread return this instance to perform operations on data if this is not an empty instance or initialize a new instance on demand. –  Feb 28 '14 at 15:20
  • Can you add SimpleDateFormat sdf as a private static variable to ISODateFormatter and initialize it in the static block? Then you would return the same instance every time initialValue() is being called. – Boris Feb 28 '14 at 15:24
  • @user3301492: SimpleDateFormat is not threadsafe, so this would be a bad idea. – beny23 Feb 28 '14 at 15:25
  • 2
    Since you've gone to the trouble of building a harness, you might as well benchmark [FastDateFormat](https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/time/FastDateFormat.html) and [DateTimeFormatter](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormatter.html) – lance-java Feb 28 '14 at 15:25
  • 1
    @LanceJava: user3301492 was suggesting to have a single private static SimpleDateFormatter and returning that *same* instance in the initialValue. This would mean that the *same* instance would be used in each thread, which would mean SimpleDateFormatter would *no longer* be used in a thread-safe manner. – beny23 Feb 28 '14 at 15:33
  • Each thread if you notice formats 65,000 times using the same SDF which is local to it. I can't explain why that time goes up as other threads are going about their local business :) I am not starving the CPU at all as per TOP. – Debajyoti Roy Feb 28 '14 at 15:41
  • @DebajyotiRoy For each thread can you get their average wait time independently and display per thread instead of a shared AtomicLong? – John Vint Feb 28 '14 at 15:56
  • tried that, the average times start to go up regardless - updated in the original question – Debajyoti Roy Feb 28 '14 at 16:10
  • Why are you overriding `get`, `remove`, and `set` if all you're doing in the body is calling the parent method? – dimo414 Feb 28 '14 at 16:12
  • 4
    You have 24 cores, and you go up to 64 threads in the pool. Since you have more threads than available cores, the threads start to compete with each other, and context-switching on each core starts to appear. It looks quite normal to me that this context-switching increases your response time. – JB Nizet Feb 28 '14 at 16:32
  • I was also hoping this is just because of CPU Starving, just that TOP is not showing me heavy CPU usage – Debajyoti Roy Feb 28 '14 at 16:33

4 Answers4

8

Creating an instance of SimpleDateFormat is very expensive (this article shows some profiling/benchmarking). If this is true, compared with the parsing of the dates into strings, then it follows that as you increase the number of threads (and therefore the number of SimpleDateFormat instances as they are threadlocals) your average time is going to increase.

Martin Wilson
  • 3,386
  • 1
  • 24
  • 29
7

SimpleDateFormat Not Thread-Safe

As the correct answer by Martin Wilson states, instantiating a SimpleDateFormat is relatively expensive.

Knowing that your first thought might be, "Well, let's cache an instance for re-use.". Nice thought, but beware: The SimpleDateFormat class in not thread-safe. So says the class documentation under its Synchronization heading.

Joda-Time

A better solution is to avoid the notoriously troublesome (and now outmoded) java.util.Date, .Calendar, and SimpleDateFormat classes. Instead use either:

  • Joda-Time
    third-party open-source library, popular replacement for Date/Calendar.
  • java.time package
    New, bundled in Java 8, supplanting the old Date/Calendar classes, inspired by Joda-Time, defined by JSR 310.

Joda-Time is intentionally built to be thread-safe, largely through use of immutable objects. There are some mutable classes, but those are not usually used.

This other question on StackOverflow explains that the DateTimeFormatter class is indeed thread-safe. So you can create one instance, cache it, and let all your threads use that formatter without adding any extra synchronization or other concurrency controls.

harbun
  • 515
  • 6
  • 23
Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
  • 2
    The OP is caching a SimpleDateFormat in a ThreadLocal. Each thread thus has its own SimpleDateFormat instance, not shared with other threads. That's thus a good way to reuse a SimpleDateFormat, without risking thread-safety issues. – JB Nizet Feb 28 '14 at 23:07
7

Another approach to speed up your formatting is to cache the formatted result. This considers the fact, that there are usually not so many different dates to format. If you split the formatting of date and time, it is even a better candidate for caching.

The downside of this is, that normal Java cache implementations, like EHCache, are to slow, the cache access just takes longer then the formatting.

There is another cache implementation around that has access times on par with a HashMap. In this case you get a nice speed up. Here you find my proof of concept tests: https://github.com/headissue/cache2k-benchmark/blob/master/zoo/src/test/java/org/cache2k/benchmark/DateFormattingBenchmark.java

Maybe this can be a solution within your scenario.

Disclaimer: I am working on cache2k....

cruftex
  • 5,545
  • 2
  • 20
  • 36
  • the first URL is not valid anymore. – Julien Kronegg Jan 26 '21 at 10:52
  • 1
    Concerning the broken link reported by @JulienKronegg, here is a possible replacement: https://www.programcreek.com/java-api-examples/?code=cache2k%2Fcache2k-benchmark%2Fcache2k-benchmark-master%2Ftestbed%2Fsrc%2Ftest%2Fjava%2Forg%2Fcache2k%2Fbenchmark%2FDateFormattingBenchmark.java# – Stephan Feb 24 '21 at 06:51
1

Our usecase was write once(single thread) and read many times(concurrently). So i converted Date to String at the time of storing the data, instead of doing this each time a request needs to be responded.

Debajyoti Roy
  • 985
  • 2
  • 12
  • 34