I have a Laravel 10 application that allows voting for candidates and shows a list of those candidates sorted by votes. The list is fetched from the database, processed a bit, and saved to the Redis cache under a tag. (Several tags, actually: a list for users, a panel list, etc., but that's not relevant here, as the problem occurs with a single tag as well.)
After a user votes for a candidate, the application flushes all of the tag's data. The next list request fetches everything from the database again and saves it back to the Redis cache (tagged). Normally everything looks OK: the cached data is flushed properly and the user always sees up-to-date data, and if nobody is voting at the moment, the data is served from the cache, making requests faster.
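For context, the flow is roughly this (a minimal sketch; Candidate, the query, and the TTL are illustrative stand-ins, not the real application code):

use Illuminate\Support\Facades\Cache;

// Fetch-or-compute the sorted candidate list under the tag.
$votes = Cache::tags(['object-1'])->remember('cand-votes', now()->addMinutes(5), function () {
    return Candidate::withCount('votes')
        ->orderByDesc('votes_count')
        ->get();
});

// After a vote is persisted, the whole tag is flushed:
Cache::tags(['object-1'])->flush();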
This is how Laravel 10 handles a tagged cache write:
Cache::tags(['object-1'])->put('cand-votes', $votes, $TTL);
creates two keys in Redis:
- prefix:tag:tagName:entries holds information about all cache keys that belong to the tag, so Laravel knows exactly which keys to remove when the whole tag is flushed.
- prefix:uniqid:cacheKey holds the actual cached data.
Normal situation (low traffic):
keys app_cache*
1) "app_cache:tag:object-1:entries"
2) "app_cache:6461bc79646c190a1c986d8ca1ab917e8ddf0d8f:cand-votes"
in app_cache:tag:object-1:entries there is:
zscan app_cache:tag:object-1:entries 0
1) "0"
2) 1) "6461bc79646c190a1c986d8ca1ab917e8ddf0d8f:cand-votes"
2) "1689766717"
so it is known which cache keys belong to this tag and when they expire. Getting the key:
get app_cache:6461bc79646c190a1c986d8ca1ab917e8ddf0d8f:cand-votes
"150"
the stored value is correct.
The problem occurs when there are around 15 votes per second or more (and thus about 15 tag flushes hitting Redis per second). At some point the tag stored in Redis becomes... broken, and it is impossible to flush the data without wiping all of Redis with the artisan cache:clear command.
Invalid behaviour (high traffic: 15 users cast 150 votes within 8 seconds):
keys app_cache*
1) "app_cache:6461bc79646c190a1c986d8ca1ab917e8ddf0d8f:cand-votes"
get app_cache:6461bc79646c190a1c986d8ca1ab917e8ddf0d8f:cand-votes
"38"
Only the cache key with the (now outdated) data is left. The entries key, with the sorted set of cache keys, is gone.
Now, inside the application, calling:
Cache::tags(['object-1'])->get('cand-votes');
always returns stale data (38 in this case). And flushing the cache with:
Cache::tags(['object-1'])->flush();
cannot remove that key (because the entries key is gone). So I am stuck with invalid data for the whole TTL, unable to flush it with another request. Only a full cache purge with the artisan cache:clear command helps.
Overall it looks like a concurrent cache put recreates a key that was just deleted, while the entries key is deleted shortly afterwards, leaving the value orphaned.
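From my reading of Laravel 10's RedisTaggedCache source (an assumption on my part, not verified under a debugger): put() first ZADDs the key into the :entries set and then SETs the value key, while flush() first DELs the value keys it finds in the :entries set and then DELs the :entries key itself. If that is right, an interleaving like this produces exactly the state above:

request A (list rebuild)                   request B (vote -> flush)
ZADD app_cache:tag:object-1:entries ...
                                           ZSCAN entries, DEL found value keys
SET app_cache:6461...:cand-votes "38"
                                           DEL app_cache:tag:object-1:entries

End state: the value key written by A survives, but the :entries set that flush() needs in order to find it is gone.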
- I tried cache locks around the entire vote-adding process (see the sketch after this list); it helps a little, but requests slow down waiting for the lock and some even fail on lock timeout.
- I tried Redis transactions around the tag flush; it does not make things any better.
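The lock attempt looked roughly like this (simplified; the lock name, the timeouts, and $vote are just examples):

use Illuminate\Contracts\Cache\LockTimeoutException;
use Illuminate\Support\Facades\Cache;

try {
    // Wait up to 5 seconds for an exclusive lock that is held for at most 10.
    Cache::lock('vote-processing', 10)->block(5, function () use ($vote) {
        $vote->save();                        // persist the vote
        Cache::tags(['object-1'])->flush();   // invalidate the candidate list
    });
} catch (LockTimeoutException $e) {
    // Under load, some requests never obtain the lock and end up here.
}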
Some workarounds I thought of:
- obviously, disabling the cache during these high-load moments so the data is always read fresh from the database
- moving the cache flush (plus some kind of cache warming) to a queue so the flushes are serialized instead of concurrent (sketched below)
Neither is ideal; both feel like hacks to me.
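For the queue idea, a minimal sketch (RefreshCandidateList, Candidate, and the TTL are hypothetical names; ShouldBeUnique is meant to collapse a burst of votes into a single refresh):

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\Cache;

class RefreshCandidateList implements ShouldQueue, ShouldBeUnique
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function handle(): void
    {
        // With a single queue worker, flush and re-warm never run concurrently.
        Cache::tags(['object-1'])->flush();

        Cache::tags(['object-1'])->put(
            'cand-votes',
            Candidate::withCount('votes')->orderByDesc('votes_count')->get(),
            now()->addMinutes(5) // illustrative TTL
        );
    }
}

// Dispatched after every vote instead of flushing inline:
RefreshCandidateList::dispatch();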
Has anybody using the tagged cache encountered this problem? I am looking for any suggestions on how to deal with this kind of concurrency around cache flushing.