My scenario is as follows. I have a data table with a million rows of tuples (say first name and last name), and a client that needs to retrieve a small subset of rows whose first name or last name begins with the query string. Caching this seems like a catch-22, because:
- On the one hand, I can't store and retrieve the entire data set on every request (that would overwhelm the network).
- On the other hand, I can't just cache each row individually under its own key, because then I'd have no way to run a prefix query against the cache.
- Storing ranges of values in the cache, with a local "index" or directory, would work... except that you'd have to essentially duplicate the data for each index (one for first name, one for last name), which defeats the purpose of using a distributed cache in the first place (see the sketch after this list).
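To make the third bullet concrete, here's a minimal sketch of what I mean by bucketing ranges under a local index. A plain dict stands in for the distributed cache, and the key layout ("first:j", "last:s") is just something I made up for illustration; note how every row gets stored twice, once per index:

```python
from collections import defaultdict

def build_buckets(rows):
    """Bucket rows by the first letter of each name, one cache entry per bucket."""
    buckets = defaultdict(list)  # stand-in for cache.put(key, value)
    for first, last in rows:
        # Each row is written twice -- under a first-name key AND a last-name
        # key -- which is exactly the duplication I'm worried about.
        buckets["first:" + first[:1].lower()].append((first, last))
        buckets["last:" + last[:1].lower()].append((first, last))
    return buckets

def query_buckets(buckets, prefix, limit=50):
    """Answer a prefix query from the cached buckets instead of hitting the data table."""
    p = prefix.lower()
    candidates = buckets.get("first:" + p[:1], []) + buckets.get("last:" + p[:1], [])
    matches = [(f, l) for f, l in candidates
               if f.lower().startswith(p) or l.lower().startswith(p)]
    return matches[:limit]

# Example: two fetches from the "cache", no trip to the data table.
buckets = build_buckets([("John", "Smith"), ("Jane", "Doe"), ("Dave", "Jones")])
print(query_buckets(buckets, "J"))  # [('John', 'Smith'), ('Jane', 'Doe'), ('Dave', 'Jones')]
```

So the cache ends up holding roughly two full copies of the table, one per searchable column, and it only gets worse with more columns or multiple prefix lengths.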
What approach is advisable here? Is it possible to get the benefits of a distributed cache for this kind of query, or is it simply not feasible for this scenario?