Is this Redis Race Condition Scenario Possible?

Question

I'm debugging an issue in an application and I've run into a scenario where I'm out of ideas, but I suspect a race condition might be in play.

Essentially, I have two API routes - let's call them A and B. Route A generates some data and Route B is used to poll for that data.

Route A first creates an entry in the Redis cache under a given key, then starts a background thread to generate some data. The route immediately returns a polling ID to the caller while the background thread continues to run. When the data is fully generated, we write it to the cache using the same cache key. Essentially, an overwrite.

Route B is a polling route. We simply query the cache using that same cache key - we expect one of 3 scenarios in this case:

1. The object is in the cache but contains no data - this indicates that the data is still being generated by the background thread and isn't ready yet.
2. The object is in the cache and contains data - this means that the process has finished and we can return the result.
3. The object is not in the cache - we assume that this means you are trying to poll for an ID that never existed in the first place.
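
To make the three outcomes concrete, here is a minimal sketch of the flow, assuming a cache client that exposes `get` and `set` (as redis-py's `Redis` class does) and a JSON-encoded placeholder. The function names, the `job:` key prefix, and the payload shape are hypothetical and only for illustration; they are not the actual application code.

```python
import json
import uuid


def key_for(polling_id):
    # Hypothetical key format; the real application may use something else.
    return f"job:{polling_id}"


def create_job(cache):
    """Route A: write an empty placeholder, then hand back a polling ID."""
    polling_id = uuid.uuid4().hex
    cache.set(key_for(polling_id), json.dumps({"data": None}))
    # A background worker would be started here and later call finish_job().
    return polling_id


def finish_job(cache, polling_id, data):
    """Background worker: overwrite the placeholder with the result."""
    cache.set(key_for(polling_id), json.dumps({"data": data}))


def poll_job(cache, polling_id):
    """Route B: map the cache state onto the three scenarios."""
    raw = cache.get(key_for(polling_id))
    if raw is None:
        return "unknown", None      # scenario 3: key not in the cache
    entry = json.loads(raw)
    if entry["data"] is None:
        return "pending", None      # scenario 1: placeholder only
    return "ready", entry["data"]   # scenario 2: result available
```

The unexpected case is `poll_job` returning `"unknown"` for an ID that `create_job` has already returned.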

For the most part, this works as intended. However, every now and then we see scenario 3 being hit, where an error is being thrown because the object wasn't in the cache. Because we add the placeholder object to the cache before the creation route ever returns, we should be able to safely assume this scenario is impossible. But that's clearly not the case.

Is it possible that there is some delay between when a Redis write operation returns and when the data is actually available for querying? That is, is it possible that even though the call to add the cache entry has completed, the data would briefly not be returned by queries? It seems to be the only thing that can explain the behavior we are seeing.

If that is a possibility, how can I avoid this scenario? Is there some way to force Redis to wait until the data is available for query before returning?

score 2 · Accepted Answer · answered Nov 11 '22 at 06:34

Is it possible that there is some delay between when a Redis write operation returns and when the data is actually available for querying?

Yes, and it depends on your Redis topology and network configuration. Only a standalone Redis server provides strong consistency, and even then with some caveats (see below).

Redis replication

With Redis replication, writes accepted by the master take some time to propagate to its replica(s), and the whole process is asynchronous. Your client may be sending read-only commands to replicas, a common approach to distributing load across the nodes of your topology. If that is the case, you can lower the chance of an inconsistent read by:

directing your read queries to the master node; and/or,
issuing a WAIT command right after the write operation to block until the replicas have acknowledged it. This makes replication effectively synchronous from the client's standpoint, but it should be used only if absolutely needed because of its performance cost.
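
As a rough illustration of the second option, the sketch below writes a key and then issues WAIT. It assumes a redis-py style client, whose `set(name, value)` and `wait(num_replicas, timeout)` methods map onto the SET and WAIT commands; the helper name `set_and_wait` and its default values are hypothetical.

```python
def set_and_wait(client, key, value, num_replicas=1, timeout_ms=100):
    """Write a key, then block until replicas acknowledge the write.

    `client` is assumed to behave like redis-py's `redis.Redis`.
    Returns True if at least `num_replicas` replicas acknowledged
    the write before the timeout (in milliseconds) expired.
    """
    client.set(key, value)
    # WAIT returns the number of replicas that acknowledged all writes
    # sent on this connection before the WAIT call.
    acked = client.wait(num_replicas, timeout_ms)
    return acked >= num_replicas
```

With a real connection this might be called as `set_and_wait(redis.Redis(host="localhost"), "job:123", "{}")`, setting `num_replicas` to the number of replicas you read from. If it returns `False`, the placeholder may not be visible on every replica yet, so the caller can retry or fall back to reading from the master.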

There would still be the (tiny) possibility of an inconsistent read if, during a failover, the replication process promotes a replica which did not receive the write operation.

Standalone Redis server

With a standalone Redis server, there is no need to synchronize data with replicas and, on top of that, your read-only commands are always handled by the same server that processed the write commands. This is the only strongly consistent option, provided you are also persisting your data accordingly: otherwise, a server restart between your write and read operations could lose the entry.

Persistence

Redis supports several different persistence options; in your scenario, you may want to configure your server so that it

logs every write operation to disk (AOF) and
fsyncs after every write (appendfsync always).

Of course, every configuration setting is a trade off between performance and durability.
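
For reference, the two settings above correspond to the `appendonly` and `appendfsync` configuration directives. The sketch below applies them at runtime through a redis-py style client's `config_set(name, value)` method; the helper name `enable_strict_aof` is hypothetical, and it assumes the server allows CONFIG SET, which some managed Redis offerings may restrict.

```python
def enable_strict_aof(client):
    """Turn on AOF persistence with an fsync after every write.

    `client` is assumed to behave like redis-py's `redis.Redis`.
    """
    # Log every write operation to the append-only file.
    client.config_set("appendonly", "yes")
    # fsync each time new commands are appended to the AOF (slowest, most durable).
    client.config_set("appendfsync", "always")
```

Settings changed with CONFIG SET apply to the running server only; to keep them across restarts, the same two directives would also need to be present in `redis.conf`.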

A very thorough answer, thanks. – Bassinator Nov 11 '22 at 14:21
