8

(preliminary note: maybe this is better suited for codereview?)

EDIT Answer to self; I believe this answer covers all my needs/problems, but of course, comments are welcome. Original question left below for reference.

Hello,

Of interest here is the .getSources() method. This method is meant to return a list of message sources for a given Locale.

The two central data structures to this method are sources and failedLookups, see the code for comments.

This particular implementation of .getSources() can only ever return either an empty list or a single-element list, depending on the tryAndLookup() method which prototype is:

protected abstract MessageSource tryAndLookup(final Locale locale)
    throws IOException;

Right now, the logic of the code is as follows:

  • if the message source for that locale has already been successfully looked up, it is returned;
  • from this point on, no lookup has been done; it is however unknown whether this means that a previous lookup attempt has been done: check the set of failed lookups, if the locale to lookup is in there, it is a known failure, return the empty list;
  • now, the known situation is that this locale lookup has actually never been performed: do perform it; depending on the result of the tryAndLookup method, record either a success or a failure.

Now, why I go to such lengths: I have no control over tryAndLookup(); it may take an inordinate amount of time to execute before returning a valid source or failing. As a result, I am reluctant to using a coarse lock or synchronized block.

/**
 * Set of locales known to have failed lookup.
 *
 * <p>When a locale is in this set, it will not attempt to be reloaded.</p>
 */
private final Set<Locale> lookupFailures
    = new CopyOnWriteArraySet<Locale>();

/**
 * Set of message sources successfully looked up
 *
 * <p>When a source is in there, it is there permanently for now.</p>
 */
private final ConcurrentMap<Locale, MessageSource> sources
    = new ConcurrentHashMap<Locale, MessageSource>();

@Override
protected final List<MessageSource> getSources(final Locale locale)
{
    MessageSource source = sources.get(locale);

    /*
     * If found, return it
     */
    if (source != null)
        return Arrays.asList(source);

    /*
     * If it is a registered failure, return the empty list
     */
    if (lookupFailures.contains(locale))
        return Collections.emptyList();

    /*
     * OK, try and look it up. On success, register it in the sources map.
     * On failure, record the failure an return the empty list.
     */
    try {
        source = tryAndLookup(locale);
        sources.putIfAbsent(locale, source);
        // EDIT: fix for bug pinpointed by JBNizet
        // was:
        // return Arrays.asList(source);
        // now is:
        return Arrays.asList(sources.get(locale));
    } catch (IOException ignored) {
        lookupFailures.add(locale);
        return Collections.emptyList();
    }
}

My question here is threefold:

  • I purposefully limit myself to classes only available to the JDK; I chose ConcurrentHashMap as a ConcurrentMap implementation, and CopyOnWriteArraySet as a thread-safe Set implementation; from the javadoc, these are the best I could find. But was I misled somewhere?
  • I think this code is eventually thread safe; some corner cases may lead to lookups being done more than once for instance, but then this is why I do .putIfAbsent(); right now I have always used, and trusted, Guava's LoadingCache for caching purposes, and this is my first foray out of this territory; is this code actually thread safe?
  • this code has a fatal defect: more than one thread may be executing tryAndLookup() at the same time... What solutions exist to have this method executed only once per lookup?
Community
  • 1
  • 1
fge
  • 119,121
  • 33
  • 254
  • 329
  • Given Java 6 is at the end of the line, perhaps it's time to use Java 7. ;) Do you know you need to cache the locales like this? Is this something you lookup millions of times per second only a few thousand times per second? – Peter Lawrey Jun 01 '13 at 21:28
  • 1
    Well, it may be officially at the end of line, but it is still the most widely used version out there :/ As to lookup frequency, it is rather in the order of tens of times a second, hundreds at a maximum... I am just aiming at having bulletproof code eventually ;) – fge Jun 01 '13 at 21:30
  • For a few hundred times per second I would have a plain synchronized block. You will very rarely have two threads accessing it at the same time. I prefer to keep things simple unless you know there is a good reason to make things more complicated. – Peter Lawrey Jun 01 '13 at 21:32
  • As to "Do you know you need to cache the locales like this?", the thing is, if a source for a locale fails to load, I want to minimize the amount of times it is attempted to be loaded again, ideally 0. I don't have control over the implementation of `tryAndLookup()`, it may require a long time to execute. (edited post accordingly) – fge Jun 01 '13 at 21:33
  • Have you timed it? I would be curious to know how long it takes. – Peter Lawrey Jun 01 '13 at 21:37
  • Well, I have not really timed it; I can do it, though, since it's abstract, I can incur arbitrary delays etc; the thing is, the algorithm I implement here should be as solid as possible regardless of the amount of time this (overridable by nature) method takes to complete. – fge Jun 01 '13 at 21:40
  • Instead of caching like this you could attempt every Locale on startup. This would avoid any multi-threaded update concerns and you could use a plain HashMap. – Peter Lawrey Jun 01 '13 at 21:41
  • I was just making the point that you should only optimise things you know need optimising because you measured them. It sounds like it is likely to be a good idea, but you would have to see using a real program. It could be for example that the program never asks for a Locale which is not supported. – Peter Lawrey Jun 01 '13 at 21:43
  • 1
    I agree with Peter regarding keeping things simple, and optimizing only if necessary. But if `tryAndLookup()` is foreign code, not under your control, it might be a good idea to avoid calling it from a synchronized block, because tryAndLookup could try to acquire other locks, and coould lead to a deadlock. – JB Nizet Jun 01 '13 at 21:45
  • @PeterLawrey: "It could be for example that the program never asks for a Locale which is not supported" <-- indeed; the `Locale` constructors can yield locales which are not in the list of what the JVM supports (`ResourceBundle` can also look up such unsupported locales); this is why I purposefully load only the queried locales. @JBNizet: exactly, this was my primary concern when writing this code to begin with... My final goal remains for the code to be as independent as possible from `tryAndLookup()`'s possible mishaps, and thread safe. – fge Jun 01 '13 at 21:53
  • If you want to call tryAndLookup() as little as possible, use a simple synchronized block where you add the key of every Locale you lookup and if the Map has a `null` or dummy MessageSource, you know you have no data. – Peter Lawrey Jun 01 '13 at 21:57
  • 2
    Side note: why do you return a List rather than a MessageSource? Also, shouldn't you return the instance that could already be present in the map (and would be returned by `putIfAbsent()`) instead of returning the newly looked up message source. – JB Nizet Jun 01 '13 at 22:01
  • @JBNizet as to whether I return a `List`, another implementation of the parent class has the ability to return several sources for one locale; this implementation purposefully limits itself to one source per locale. As to the instance returned by `.putIfAbsent()`, that is actually a good point... And that is a bug. Thanks for noticing! – fge Jun 01 '13 at 22:21

3 Answers3

1

As an alternative to the CopyOnWriteArraySet, you could use a ConcurrentHashMap with meaningless values (e.g. always use Boolean.TRUE as the value - you only care about the keys), or else you could use a ConcurrentSkipListSet which is essentially a concurrent TreeSet that uses a skiplist instead of a balanced binary tree.

Assuming that tryAndLookup is fast and doesn't have any side-effects, it doesn't really matter if you occasionally execute it more than once, as as such your "eventually thread-safe" code is thread-safe in the sense that it won't produce any anomalous behavior. (If it's slow then it may be more efficient to use locks in order to ensure that you execute it as few times as possible, but in this case your code would still be free from anomalous behavior. If it has side-effects then you may have a data race if you execute it twice on the same locale.)

Edit Because tryAndLookup may have side-effects, you can either synchronize on locale, or else you can modify your method as follows

private final MessageSource nullSource = new MessageSource(); // Used in place of null in the ConcurrentHashMap, which does not accept null keys or values

protected final List<MessageSource> getSources(final Locale locale) {
...

    try {
        if(sources.putIfAbsent(locale, nullSource) == null) {
            source = tryAndLookup(locale);
            sources.replace(locale, nullSource, source);
            return Arrays.asList(sources.get(locale));
        } else {
            // Another thread is calling tryAndLookup, so wait for it to finish
            while(sources.get(locale) == nullSource && !lookupFailures.contains(locale))
                Thread.sleep(500);
            }
            if(sources.get(locale) != nullSource) {
                return Arrays.asList(sources.get(locale));
            } else {
                return Collections.emptyList();
            }
        }
    } catch (IOException ignored) {
        lookupFailures.add(locale);
        return Collections.emptyList();
    }
}
Zim-Zam O'Pootertoot
  • 17,888
  • 4
  • 41
  • 69
  • BTW You can use `Collections.newSetFromMap(new ConcurrentHashMap())` to get a concurrent hash set. – Peter Lawrey Jun 01 '13 at 21:44
  • "Assuming that tryAndLookup is fast and doesn't have any side-effects" <-- well, I cannot really assume this... This is why I am thinking about `Future` at the moment. – fge Jun 01 '13 at 22:29
  • As to `ConcurrentSkipListSet`, `Locale` does not implement `Comparable` of itself, so... – fge Jun 01 '13 at 22:55
  • @fge See the edit to my response for how you can avoid calling `tryAndLookup` multiple times without using locks. The idea is that you call `putIfAbsent(locale, nullSource)` to prevent a concurrent call to `tryAndLookup(locale)`, then you replace `nullSource` with the actual source upon successfully calling `tryAndLookup`. – Zim-Zam O'Pootertoot Jun 01 '13 at 23:01
  • @Zim-ZamO'Pootertoot studying! Thanks! Maybe I won't need `FutureTask` after all... – fge Jun 01 '13 at 23:16
  • @Zim-ZamO'Pootertoot one thing though: `Thread.sleep()` can throw an `InterruptedException`... Argh. Not sure how to handle this yet. – fge Jun 01 '13 at 23:37
  • @fge it's very rare for a thread to be interrupted - generally speaking, a thread can only be interrupted if you explicitly call `interrupt` on it. As such, you can simply ignore the `InterruptedException` within the while loop, or alternatively you can let the exception propagate to the caller - either alternative is reasonable. – Zim-Zam O'Pootertoot Jun 01 '13 at 23:43
  • @Zim-ZamO'Pootertoot I have tried your code... Looks like the `Thread.sleep()` leads to performance problems in one of my "heavy thread" tests of message lookups, and I don't know why -- which is a cause of concern for me :/ – fge Jun 02 '13 at 00:10
  • Something is probably doing a busy-wait on the results. You'll probably have better luck with a `Future` – Zim-Zam O'Pootertoot Jun 02 '13 at 00:15
1

You can run the first two steps in a synchronized block using simpler data types. If in those steps you find you need to perform the tryAndLookup(), store a Future for the pending result in a separate list of pending lookups before leaving the synchronized block.

Then outside the synchronized block, do the actual lookup. Threads that find they need the same result will find the Future and can get() its result outside the synchronized block.

flup
  • 26,937
  • 7
  • 52
  • 74
0

Answer to self...

The algorithm has been completely revamped. It is based on the JDK's FutureTask, since it has two very nice properties:

  • its .run() method is asynchronous;
  • it is stateful, on success as well as on failure -- meaning, it will return the already computed result or the thrown exception (wrapped into an ExecutionException) on .get().

This has quite an implication on the used data structures:

  • the previously existing lookupFailures set, which recorded failures, has disappeared;
  • the existing map, which was a ConcurrentMap, is now replaced by a plain Map, guarded by a ReentrantLock; its values are now FutureTask<MessageSource> instead of MessageSource.

This has also quite an implication on the algorithm, which is much more simple:

  • lock the map;
  • lookup the task for the locale; if it didn't previously exist, create it, and .run() it;
  • unlock the map;
  • .get() the result: return a single element list on success, an empty list on failure.

Full code, with comments:

@ThreadSafe
public abstract class CachedI18NMessageBundle
    extends I18NMessageBundle
{
    /**
     * Map pairing locales with {@link FutureTask} instances returning message
     * sources
     *
     * <p>There will only ever be one task associated with one locale; we
     * therefore choose to make it a normal map, guarded by a {@link
     * ReentrantLock}.</p>
     *
     * <p>The tasks' {@link FutureTask#run()} method will be executed the first
     * time this object is initialized.</p>
     */
    @GuardedBy("lock")
    private final Map<Locale, FutureTask<MessageSource>> lookups
        = new HashMap<Locale, FutureTask<MessageSource>>();

    /**
     * Lock used to guarantee exclusive access to the {@link #lookups} map
     */
    private final Lock lock = new ReentrantLock();

    @Override
    protected final List<MessageSource> getSources(final Locale locale)
    {
        FutureTask<MessageSource> task;

        /*
         * Grab an exclusive lock to the lookups map. The lock is held only for
         * the time necessary to grab the FutureTask or create it (and run it)
         * if it didn't exist previously.
         *
         * We can do this, since FutureTask's .run() is asynchronous.
         */
        lock.lock();
        try {
            /*
             * Try and see whether there is already a FutureTask associated with
             * this locale.
             */
            task = lookups.get(locale);
            if (task == null) {
                /*
                 * If not, create it and run it.
                 */
                task = lookupTask(locale);
                lookups.put(locale, task);
                task.run();
            }
        } finally {
            lock.unlock();
        }

        /*
         * Try and get the result for this locale; on any failure event (either
         * an IOException thrown by tryAndLookup() or a thread interrupt),
         * return an empty list.
         */
        try {
            return Arrays.asList(task.get());
        } catch (ExecutionException ignored) {
            return Collections.emptyList();
        } catch (InterruptedException  ignored) {
            return Collections.emptyList();
        }
    }

    protected abstract MessageSource tryAndLookup(final Locale locale)
        throws IOException;

    @Override
    public final Builder modify()
    {
        throw new IllegalStateException("cached bundles cannot be modified");
    }

    /**
     * Wraps an invocation of {@link #tryAndLookup(Locale)} into a {@link
     * FutureTask}
     *
     * @param locale the locale to pass as an argument to {@link
     * #tryAndLookup(Locale)}
     * @return a {@link FutureTask}
     */
    private FutureTask<MessageSource> lookupTask(final Locale locale)
    {
        final Callable<MessageSource> callable = new Callable<MessageSource>()
        {
            @Override
            public MessageSource call()
                throws IOException
            {
                return tryAndLookup(locale);
            }
        };

        return new FutureTask<MessageSource>(callable);
    }
}

I could even test the "single task created", on lookup success as well as failure, since tryAndLookup() is abstract, therefore "spyable" using Mockito. Full test class source here (methods onlyOneTaskIsCreatedPer*LocaleLookup()).

fge
  • 119,121
  • 33
  • 254
  • 329
  • 1
    I believe the `run` method will execute the task on the calling thread so you'll need to release the lock before calling it. – flup Jun 02 '13 at 06:51
  • 1
    See also this [Memoizer implementation](http://stackoverflow.com/a/14295737/829571) taken from JCiP. – assylias Jun 02 '13 at 07:17