I don't have experience scaling out a SQL back end, but from what I've read so far, sharding for writes and caching for reads seem to be two of the most common practices. I'm trying to learn how the window of inconsistency could be minimized with the right caching strategy.
I'd like to use Azure SQL Database, Entity Framework & Elastic Scale middleware, and Redis for testing purposes.
Is there a way to commit a distributed transaction to both SQL Server and Redis?
If not, what's an effective way to ensure read freshness when a database change occurs?
I could write to SQL and update the cache in the same API call, but the cache write might fail for any number of reasons. I could add retry logic, but if every attempt still fails, I see a few options (a sketch of the retry-then-evict flow follows below):

- Try to roll back the SQL transaction.
- Simply serve the old cache data to clients and periodically rebuild the cache to catch up with the database. Of course, this means reads are inconsistent for a period of time.
- Evict the entry and read from the SQL cluster, but cross-shard queries might be very expensive, especially when they involve complex joins and you have hundreds, if not thousands, of databases on commodity hardware.
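To make the trade-offs concrete, here's a minimal sketch of what I have in mind: commit to SQL first (the system of record), retry the cache write with backoff, and on total failure evict the key so readers fall through to the database instead of seeing stale data. It's Python with sqlite3 standing in for an Azure SQL shard and redis-py for the cache (the real stack would be C#/EF, and names like `save_order`, `get_order`, and `CACHE_TTL` are placeholders of mine, not from any library):

```python
import json
import sqlite3
import time

import redis  # pip install redis

CACHE_TTL = 300  # seconds; bounds staleness even if an invalidation is missed
RETRIES = 3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)


def save_order(order_id: int, total: float) -> None:
    # 1. Commit to the system of record first.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO orders (id, total) VALUES (?, ?)",
            (order_id, total),
        )

    # 2. Best-effort cache update with a few retries and backoff.
    key = f"order:{order_id}"
    payload = json.dumps({"id": order_id, "total": total})
    for attempt in range(RETRIES):
        try:
            cache.set(key, payload, ex=CACHE_TTL)
            return
        except redis.RedisError:
            time.sleep(0.05 * 2 ** attempt)

    # 3. All retries failed: evict rather than leave a stale entry,
    #    so readers fall through to SQL instead of seeing old data.
    try:
        cache.delete(key)
    except redis.RedisError:
        pass  # the TTL above is the last line of defense


def get_order(order_id: int) -> dict:
    # Cache-aside read: try Redis, fall through to SQL and repopulate.
    key = f"order:{order_id}"
    try:
        hit = cache.get(key)
        if hit is not None:
            return json.loads(hit)
    except redis.RedisError:
        pass  # treat a cache outage as a miss

    row = db.execute(
        "SELECT id, total FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    order = {"id": row[0], "total": row[1]}
    try:
        cache.set(key, json.dumps(order), ex=CACHE_TTL)
    except redis.RedisError:
        pass
    return order
```

The TTL on every entry acts as an upper bound on staleness even when the eviction itself fails, but that's exactly the part I don't know how to tighten further without a true distributed transaction, hence the question.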