I am designing search functionality where a user searches for a word across millions of, say, tweets. I will build an index service that stores some sort of mapping from words to tweets. I also want to introduce a cache to store the results for the most frequently searched words.

My doubt is this: once the result for some word, say "abc", is in the cache, all requests for it are served from the cache. Suppose the word is trending so strongly that it stays in the cache for a week. During that week there will be many new tweets, and the index mapping will have been updated with them. How can I specify that a cached result that is too old should be discarded and fresh results fetched?

Obviously I could use one of the cache write policies, but I think a write-through policy would affect search performance, since every update is written to the cache and the DB at the same time. Am I missing something here? How should I approach this?
There are only two hard things in Computer Science: cache invalidation and naming things. -- Phil Karlton – Pierre Sevrain Jul 21 '20 at 09:30
1 Answer
For something like tweets, it's okay to respond with search results that are a few minutes or even hours old.
I would not recommend a write-through cache because of the complexity it adds. A low TTL is a better approach, unless you have a specific use case that needs fresher results: each cached entry expires after a fixed time, so the next search for that word goes back to the index and picks up the new tweets. Also, since search systems are pretty good these days, it does not hurt to run the same query against the index again every few minutes.
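To make the TTL idea concrete, here is a minimal in-process sketch in Python. Everything in it is an assumption for illustration: the `TTLSearchCache` class, the `search_index` callable standing in for your index service, and the 5-minute default TTL. In a real system you would more likely set an expiry on the key in an external cache, but the logic is the same.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLSearchCache:
    """Cache-aside store whose entries expire after ttl_seconds (hypothetical helper)."""

    def __init__(
        self,
        search_index: Callable[[str], List[str]],  # stand-in for the index service
        ttl_seconds: float = 300.0,                # assumed value, tune for your traffic
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._search_index = search_index
        self._ttl = ttl_seconds
        self._clock = clock
        # word -> (expiry timestamp, cached results)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def search(self, word: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(word)
        if entry is not None:
            expires_at, results = entry
            if now < expires_at:
                # Entry is still fresh: serve it without touching the index.
                return results
        # Missing or expired: query the index and overwrite the cached entry.
        results = self._search_index(word)
        self._entries[word] = (now + self._ttl, results)
        return results
```

Note that the expiry is set when the entry is written and is not extended on reads, so even a word that is searched constantly gets refreshed from the index at most one TTL after it was cached. Nothing is written to the cache when a new tweet is indexed, which is what keeps this simpler than write-through.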
With that thought, I made this Twitter System Design video. Thoughts?
Or you can find a short summary on CodeKarle's Website here.

Sandeep Kaul