
We have N cache nodes arranged in a ring using basic consistent hashing.

Questions:

  1. Where is the data structure of this ring stored:
    • On each of these nodes?
    • Partly on each node, covering only its own ranges?
    • On a separate machine that acts as a load balancer?

  2. What happens to the ring when other nodes join it?

Thanks a lot.

Daniel Compton
Ivan Voroshilin

1 Answer


I have found an answer to question 1.

Answer 1: All the approaches are described on my blog: http://ivoroshilin.com/2013/07/15/distributed-caching-under-consistent-hashing/

There are a few options for where to keep the ring’s data structure:

  • Central point of coordination: A dedicated machine keeps the ring and works as a central load balancer that routes requests to the appropriate nodes. Pros: Very simple implementation. This is a good fit for a mostly static system with a small number of nodes and/or a small amount of data. Cons: The big drawbacks of this approach are scalability and reliability. Stable distributed systems don’t have a single point of failure.

  • No central point of coordination – full duplication: Each node keeps a full copy of the ring. This is applicable to stable networks and is used, e.g., in Amazon Dynamo. Pros: Queries are routed in one hop directly to the appropriate cache server. Cons: Whenever a server joins or leaves the ring, all cache servers in the ring must be notified and must update their copies.

  • No central point of coordination – partial duplication: Each node keeps a partial copy of the ring. This option is a direct implementation of the Chord algorithm. In DHT terms, each cache machine knows its predecessor and successor, and when it receives a query it checks whether it has the key. If the key is not on that machine, a mapping function determines which of its neighbors (successor or predecessor) is closer to the key, and the query is forwarded to that neighbor. The process continues until the current cache machine finds the key and sends it back. Pros: Under highly dynamic membership the previous option is a poor fit because of the heavy overhead of gossiping among nodes, so this option is the better choice in that case. Cons: No direct routing of messages. Routing a message to the destination node takes O(log N) hops; in Chord this bound relies on each node also keeping a finger table of shortcuts, since with successor and predecessor links alone it would be O(N).
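
To make the difference between the full-copy and partial-copy options concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not code from Dynamo or Chord: the names `ring_hash`, `FullRing`, `PartialNode`, `link` and `route` are hypothetical, MD5 truncated to 32 bits is an arbitrary hash choice, and node positions are assumed to be distinct. The partial-copy routing is simplified to forward clockwise through successor links only, so it can take up to N - 1 hops; Chord's finger tables, which are what give O(log N), are not shown.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a ring position in [0, 2**32). MD5 is an arbitrary choice."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class FullRing:
    """Full duplication: the whole ring as a sorted list of (position, node)."""

    def __init__(self, names):
        self.points = sorted((ring_hash(n), n) for n in names)

    def owner(self, key: str) -> str:
        """One-hop lookup: the first node at or clockwise after the key."""
        idx = bisect.bisect_left(self.points, (ring_hash(key), ""))
        return self.points[idx % len(self.points)][1]  # wrap past the last node

    def join(self, name: str) -> None:
        """Add a node; every holder of a full copy has to apply this change."""
        bisect.insort(self.points, (ring_hash(name), name))


class PartialNode:
    """Partial duplication: a node knows only itself and its two neighbors."""

    def __init__(self, name: str):
        self.name = name
        self.pos = ring_hash(name)
        self.predecessor = None
        self.successor = None

    def owns(self, key_pos: int) -> bool:
        """True if key_pos lies in the arc (predecessor.pos, self.pos]."""
        lo, hi = self.predecessor.pos, self.pos
        if lo < hi:
            return lo < key_pos <= hi
        return key_pos > lo or key_pos <= hi  # the arc wraps past zero


def link(names):
    """Build the nodes and wire each one to its ring neighbors."""
    nodes = sorted((PartialNode(n) for n in names), key=lambda n: n.pos)
    for i, node in enumerate(nodes):
        node.predecessor = nodes[i - 1]
        node.successor = nodes[(i + 1) % len(nodes)]
    return nodes


def route(start: PartialNode, key: str):
    """Forward clockwise until the owner is reached; return (owner name, hops)."""
    key_pos = ring_hash(key)
    node, hops = start, 0
    while not node.owns(key_pos):
        node, hops = node.successor, hops + 1
    return node.name, hops


if __name__ == "__main__":
    names = [f"cache-{i}" for i in range(8)]
    full, nodes = FullRing(names), link(names)
    print("full copy, one hop:", full.owner("user:42"))
    print("partial copy (owner, hops):", route(nodes[0], "user:42"))
```

Both lookups should agree on the owner of a key; only the number of hops differs. Calling `full.join("cache-new")` moves just the keys that fall between the new node and its predecessor, but that update has to reach every node holding a full copy, which is the cost mentioned under the second option.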

Ivan Voroshilin