Sorry, I'm a beginner in load balancing.
In distributed environments, we increasingly send the processing (map/reduce) to the data, so that the result is computed locally and then aggregated.
I'd like to apply the same principle to partitioned/distributed data (not replicated data): I'd like to be able to send a user request to the server where that user's data is cached.
When using an embedded cache or data grid to get low response times with a large dataset, we tend to avoid replication and use distributed/partitioned caches instead.
The partitioning algorithms are generally hash-based and allow replicas, in order to handle server failures.
So in the end, a user's data is generally hosted on something like 3 servers (1 primary copy and 2 replicas).
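To make this concrete, here is a minimal Python sketch of the kind of hash-based placement I have in mind. It is not any vendor's actual algorithm: the 12-server cluster, the `owners` function and the "next servers on the ring hold the replicas" rule are all assumptions for illustration.

```python
import hashlib
from typing import List

# Hypothetical 12-node cluster; names are made up for this sketch.
SERVERS = [f"server-{i}" for i in range(1, 13)]
COPIES = 3  # 1 primary copy + 2 replicas


def owners(key: str, servers: List[str] = SERVERS, copies: int = COPIES) -> List[str]:
    """Return the servers assumed to hold `key`: primary first, then replicas."""
    # Hash the key to a stable integer (md5 is used only for distribution,
    # not for security).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(servers)
    # Assumption: replicas live on the servers that follow the primary.
    return [servers[(start + i) % len(servers)] for i in range(copies)]


print(owners("387"))  # three distinct servers, always the same for this key
```

Real caches usually add an intermediate layer of partitions and rebalance when nodes join or leave, which this sketch ignores.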
On a local cache miss, the caches are generally able to look up the entry on other cache peers. This works fine but requires a network access. I'd like a load balancing strategy that avoids this unnecessary network call.
What I'd like to know: is it possible to have a load balancer that is aware of the cache's partitioning mechanism, so that it always forwards to one of the web servers holding a local copy of the data we need?
For example, I have a request www.mywebsite.com/user=387. The load balancer would check the userId 387 and know that this user is stored on servers 1, 6 and 12. It could then round-robin between them, or use some other strategy.
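Here is a rough Python sketch of the routing behaviour I'm describing, just to illustrate the idea. The `PartitionAwareBalancer` class, the `owners_of` lookup and the server names are hypothetical; in a real setup the lookup would have to come from the cache's partitioning data.

```python
import itertools
import re
from typing import Callable, Iterable, List, Optional

# Matches the user id in URLs shaped like my example: .../user=387
USER_ID = re.compile(r"user=(\d+)")


class PartitionAwareBalancer:
    """Round-robins among the servers that hold the user's data."""

    def __init__(
        self,
        owners_of: Callable[[int], Optional[List[str]]],  # hypothetical lookup
        all_servers: Iterable[str],
    ) -> None:
        self._owners_of = owners_of
        self._all_servers = list(all_servers)
        self._counter = itertools.count()

    def route(self, url: str) -> str:
        match = USER_ID.search(url)
        candidates = self._owners_of(int(match.group(1))) if match else None
        if not candidates:
            # No user id, or unknown user: fall back to every server.
            candidates = self._all_servers
        return candidates[next(self._counter) % len(candidates)]


# Hard-coded table standing in for the cache's real partitioning data.
table = {387: ["server-1", "server-6", "server-12"]}
balancer = PartitionAwareBalancer(table.get, [f"server-{i}" for i in range(1, 13)])
print([balancer.route("www.mywebsite.com/user=387") for _ in range(3)])
# → ['server-1', 'server-6', 'server-12']
```

The single shared counter keeps the sketch short; a per-user counter or a least-connections choice among the three owners would work just as well.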
If there's no generic solution, are there open-source or commercial, software or hardware load balancers that allow defining custom routing strategies?
How much will extracting data from a request slow down the load balancer? What's the cost of extracting a URL parameter (like user=387 in my example) and following some rules to reach the right web server, compared to a round-robin strategy, for example?
Is there an abstraction library on top of the cache vendors, so that we can easily retrieve the partitioning data and make it available to the load balancer?
Thanks!