0

I am using Guava LoadingCache to populate some data into it and I want to remove all the entries from that LoadingCache every 1 minute.

public class MetricHolder {
  private final ExecutorService executor = Executors.newFixedThreadPool(2);
  private final LoadingCache<String, AtomicLongMap<String>> clientIdMetricCounterCache =
      CacheBuilder.newBuilder().expireAfterWrite(1, TimeUnit.MINUTES)
          .removalListener(RemovalListeners.asynchronous(new SendToDatabase(), executor))
          .build(new CacheLoader<String, AtomicLongMap<String>>() {
            @Override
            public AtomicLongMap<String> load(String key) throws Exception {
              return AtomicLongMap.create();
            }
          });

  private static class Holder {
    private static final MetricHolder INSTANCE = new MetricHolder();
  }

  public static MetricHolder getInstance() {
    return Holder.INSTANCE;
  }

  private MetricHolder() {}

  public void increment(String clientId, String name) throws ExecutionException {
    clientIdMetricCounterCache.get(clientId).incrementAndGet(name);
  }

  public LoadingCache<String, AtomicLongMap<String>> getClientIdMetricCounterCache() {
    return clientIdMetricCounterCache;
  }

  private static class SendToDatabase implements RemovalListener<String, AtomicLongMap<String>> {
    @Override
    public void onRemoval(RemovalNotification<String, AtomicLongMap<String>> notification) {
      String key = notification.getKey();
      AtomicLongMap<String> value = notification.getValue();
      System.out.println(key);
      System.out.println(value);
      // sending these key/value to some other system

    }
  }
}

I am calling increment method from lot of different places in the code in a multithreaded way. So for a period of 1 minute it will populate lot of metrics in clientIdMetricCounterCache. Now I want to drop all those metrics reliably after every 1 minute and send all those metrics to database.

In my case, sometimes write to increment method might be very slow but still I want to drop all those entries every 1 minute and I am not doing any read at all on this cache, just writing to it and then dropping those records by sending to some other system. Below is what I saw in the Guava wiki

Caches built with CacheBuilder do not perform cleanup and evict values "automatically," or instantly after a value expires, or anything of the sort. Instead, it performs small amounts of maintenance during write operations, or during occasional read operations if writes are rare.

So how does expireAfterWrite works? Does it work like a scheduler which will run every 1 minute and delete all the entries whatever is there in clientIdMetricCounterCache and then again it will wake up after 1 minute and delete all the entries from the same cache and keep going like that? After reading the wiki, I doubt it works like that. If it doesn't, then how can I reliably drop those records every 1 minute and send to some other system as my writes can be rare for some time?

Looks like I may have to use Guava TimeLimiter interface and SimpleTimeLimiter or may be ScheduledExecutorService to reliably timeout the call and then drop the entries? If yes, can anyone provide an example how will this work in my current example?

john
  • 11,311
  • 40
  • 131
  • 251
  • I don't see how `TimeLimiter` has to do with anything, but the documentation you quoted clearly answers the question: create a background thread that calls `Cache.cleanUp()` at regular intervals. – shmosel Dec 13 '16 at 04:13
  • It's worth noting that since you're *modifying* the cached values rather than replacing them, you may lose a concurrent update in your removal listener if another thread has a reference to the `AtomicLongMap` you're trying to clean. – shmosel Dec 13 '16 at 04:17
  • @shmosel Yeah I saw that in the wiki. The confusion I have is what I am supposed to do in my background thread? Just call `Cache.cleanUp()` meaning `clientIdMetricCounterCache.cleanUp()` at regular intervals? If you can provide an example, then this will clear my confusion. – john Dec 13 '16 at 05:25
  • @shmosel Where do you see that that I am modifying the values instead of replacing them? – john Dec 13 '16 at 05:27
  • Every time you call `incrementAndGet()`. – shmosel Dec 13 '16 at 05:30
  • So let's say I am calling `increment` method 100 times with `clientId` as `abc` and `name` as `processA`. Then my cache will have `abc` string as the key and its value will be `AtomicLongMap` which will have `processA` as `100` times. Right? This is what my increment method doing. So you are saying that is not the right way as it can lose some concurrent update? – john Dec 13 '16 at 05:36
  • Right, because another can get the reference before the cleanup and update afterwards. – shmosel Dec 13 '16 at 06:03

1 Answers1

2

To me, it looks like you're misusing the cache, where a Map would do. You're using no expiration, no size limit, no caching, you're just gathering stats.

About the only feature you're using is the loading aspect, and that's not really worth it.

I'd suggest to use an AtomicReference<ConcurrentHashMap<String, AtomicLongMap>> instead:

  • When updating, you get the version for the current minute via AtomicReference::get.
  • Using the clientId, you look up an AtomicLongMap in your ConcurrentHashMap and create a new one if not found (use putIfAbsent on Java 7 or computeIfAbsent on Java 8).
  • Using the name, you update the AtomicLongMap just like you posted.
  • Once per minute you replace everything via AtomicReference::getAndSet.

With the replacement, you can be sure that your stats don't interfere, however, you should wait a bit after getAndSet as there may be threads who just obtained the reference and are about to write.

It will produce more garbage than the original approach, but all the garbage will be short living, so you might actually make the GC more happy.

It's simple and need no deep knowledge of a library or its implementation details.


I guess, volatile instead of AtomicReference would also do.

maaartinus
  • 44,714
  • 32
  • 161
  • 320
  • Do I have to use `AtomicReference` or `volatile` here? If I just have this and populate into this map from multiple threads, then will there be any problem? `ConcurrentMap> clientIdMetricCounterCache = Maps .newConcurrentMap();` – john Dec 15 '16 at 04:57
  • @david My idea was to replace the map by a new one once per minute. This idea is orthogonal to the part of using a map instead of a cache. So sure, it can work without replacing the map, it just may the periodic operation more complicated (as it probably needs to remove entries while normal operations add and use them). – maaartinus Dec 15 '16 at 11:00