
I use a Guava Cache to cache my data. An entry is evicted from the cache if it has not been accessed for several minutes.

When I modify my data, I update the copy in the cache and mark it "dirty" (because it has been modified and now differs from the copy in the database). Every 5 minutes I push the "dirty" data to the database (i.e., I update the data in the database).

The problem is this: suppose there is a "dirty" entry A. If A is evicted from the cache before it has been pushed to the database, I lose the "dirty" data.

So I added a RemovalListener to the Guava Cache. When an entry is evicted, the RemovalListener notifies me through a callback function, and in that function I try to put the data back into the cache. But in a multithreaded environment, this cannot guarantee that the data is correct.

For example:

1) Cache: evicts Data A.

2) Thread 1: gets Data A. Data A has already been evicted, so the cache loads it from the database. The copy in the database is not the newest, so Thread 1 gets an incorrect Data A.

3) Cache: runs the RemovalListener callback.
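The three steps above can be replayed deterministically on a single thread. This is only a minimal sketch: plain maps stand in for the cache and the database, and `StaleReadReplay` and its fields are hypothetical names, not part of Guava.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins: plain maps play the roles of the cache and the database.
class StaleReadReplay {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> database = new HashMap<>();

    // Replays the three steps in order and returns what Thread 1 observed.
    String replay() {
        database.put("A", "old"); // last value that was flushed to the database
        cache.put("A", "new");    // dirty value, not flushed yet

        // 1) The cache evicts A; only the pending notification still holds the value.
        String evicted = cache.remove("A");

        // 2) Thread 1 misses in the cache and loads A from the database.
        String seenByThread1 = cache.computeIfAbsent("A", database::get);

        // 3) The removal callback runs and puts the dirty value back, too late for Thread 1.
        cache.put("A", evicted);

        return seenByThread1;
    }

    public static void main(String[] args) {
        System.out.println(new StaleReadReplay().replay()); // prints "old"
    }
}
```

Thread 1 ends up with `"old"` even though the cache holds `"new"` again after step 3.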

So, how can I deal with the dirty data so that the data is always correct in a multithreaded environment? Thanks!

Defit
  • Do you really have to defer the database update that way? Is the normal process of committing to the database (and checking that it went through) too heavy? – Thilo Dec 08 '16 at 07:52
  • @Thilo If I update the database frequently, DB I/O performance suffers, so I only update the database every 5 minutes. – Defit Dec 08 '16 at 08:07

1 Answer


A possible solution is to write the dirty data to the database in the RemovalListener. If this is done synchronously, other operations on the same entry are blocked and no inconsistent state becomes visible. Depending on the latency of your database, this might affect other operations on the cache as well; see the warning in Guava's documentation.
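To illustrate the idea of flushing a dirty entry as part of its eviction, here is a minimal sketch that does not use Guava at all. It swaps in the JDK's `LinkedHashMap.removeEldestEntry` hook as a stand-in for a removal listener, evicts by size rather than by time, and guards everything with one lock. `FlushOnEvictCache`, `Slot`, and the `database` map are hypothetical names. It makes no claim about when Guava itself runs its listeners or which locks it holds.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a tiny LRU cache that writes dirty entries to the
// "database" during eviction, under the same lock that readers take.
class FlushOnEvictCache {
    static final class Slot {
        final String value;
        final boolean dirty;
        Slot(String value, boolean dirty) { this.value = value; this.dirty = dirty; }
    }

    final Map<String, String> database = new ConcurrentHashMap<>(); // stand-in for the real DB
    private final int capacity;
    private final LinkedHashMap<String, Slot> lru;

    FlushOnEvictCache(int capacity) {
        this.capacity = capacity;
        // accessOrder = true keeps the least recently used entry first.
        this.lru = new LinkedHashMap<String, Slot>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Slot> eldest) {
                if (size() <= FlushOnEvictCache.this.capacity) return false;
                // Called from inside put(), so the caller still holds the lock.
                if (eldest.getValue().dirty) {
                    database.put(eldest.getKey(), eldest.getValue().value);
                }
                return true;
            }
        };
    }

    // Reads and writes share one lock, so a reader cannot observe the gap
    // between an eviction and the matching database write.
    synchronized String get(String key) {
        Slot slot = lru.get(key);
        if (slot == null) {
            String loaded = database.get(key);
            if (loaded == null) return null;
            slot = new Slot(loaded, false);
            lru.put(key, slot); // may evict (and flush) another entry
        }
        return slot.value;
    }

    synchronized void update(String key, String value) {
        lru.put(key, new Slot(value, true)); // stays dirty until evicted
    }
}
```

The trade-off is the one mentioned above: the database write happens while the lock is held, so a slow database delays every other cache operation.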

Generally speaking, what you want is a so-called "write-behind cache". There are cache products that have this functionality built in. Take a look at existing solutions.
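For a rough picture of what write-behind means, here is a minimal sketch under simplifying assumptions: plain maps stand in for the evicting cache and the database, every method takes the same lock, and `WriteBehindStore`, `evict`, and `flush` are hypothetical names rather than the API of any existing product. The key point is that unflushed writes live in a separate map that is never evicted, and reads consult it first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical write-behind sketch; not taken from any cache library.
class WriteBehindStore {
    final Map<String, String> database = new HashMap<>();      // stand-in for the real DB
    private final Map<String, String> cache = new HashMap<>(); // entries here may be evicted
    private final Map<String, String> dirty = new HashMap<>(); // unflushed writes, never evicted

    synchronized void update(String key, String value) {
        dirty.put(key, value); // pin the unflushed value
        cache.put(key, value);
    }

    synchronized String get(String key) {
        String pending = dirty.get(key); // unflushed writes win over cache and DB
        if (pending != null) return pending;
        return cache.computeIfAbsent(key, database::get);
    }

    // Simulates the cache dropping an entry, e.g. after an idle timeout.
    synchronized void evict(String key) {
        cache.remove(key);
    }

    // Writes all pending values to the database; returns how many were written.
    synchronized int flush() {
        int written = dirty.size();
        database.putAll(dirty);
        dirty.clear();
        return written;
    }
}
```

In a setup like the one in the question, `flush()` would be called every 5 minutes, for example from a `ScheduledExecutorService`. Because eviction only touches `cache`, a dirty value cannot be lost before it is flushed. A production implementation would also need finer-grained locking and error handling for failed database writes.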

cruftex
  • Thanks! However, the two operations (the cache evicting the data, and the RemovalListener running) are not one atomic operation, so with multiple threads there will be problems like the example in my question. – Defit Dec 08 '16 at 12:12
  • I must admit that I am no Guava expert. However, the point of a synchronous removal listener is to do things that need to be done synchronously with the removal. Why do you think it is not atomic? – cruftex Dec 08 '16 at 12:59
  • @cruftex I am sorry, I am confused about the difference between synchronous and atomic. In my opinion, although the removal is synchronous, the two operations (the cache evicting the data, and the RemovalListener running) are not atomic. So I think that between the two operations, another thread can read incorrect data from the cache. This puzzles me. Can you explain it? – Defit Dec 08 '16 at 13:40
  • The "cache clean" is the expiry of an entry. This triggers the removal listener to be called. The listener is called synchronously, which means the removal operation is not complete until the listener(s) have been run. That means, if you have a loading cache, it is illegal to start a load before the listeners have done their work. If you do not have a loading cache, and populate with `Cache.put`, then it will be racy, of course. – cruftex Dec 08 '16 at 14:32
  • Okay.. I got it. Thanks! – Defit Dec 09 '16 at 01:31