
I wonder whether a DHT algorithm like Kademlia is a good fit for my specific use case.

I want a service that can maintain a large number of key-value (string, int) pairs. For each pair I also want to be able to compute things like how many increments happened in the past minute, hour, day, etc.
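To make the windowed counting concrete, here is a minimal single-node sketch in Python of what I mean. The class name `WindowedCounter`, the per-second bucket granularity, and the one-day retention limit are all assumptions for illustration, not part of any existing system. It also assumes timestamps arrive in non-decreasing order, and it sums deltas, so a decrement lowers the windowed value.

```python
import time
from collections import defaultdict, deque


class WindowedCounter:
    """Hypothetical per-key counter with per-second buckets (illustration only)."""

    def __init__(self, max_window_s=86400):
        # Largest window that can be queried; older buckets are discarded.
        self.max_window_s = max_window_s
        self.totals = defaultdict(int)      # key -> all-time value
        self.buckets = defaultdict(deque)   # key -> deque of [second, delta]

    def incr(self, key, delta=1, now=None):
        """Apply delta to key and record it in the bucket for the current second."""
        sec = int(time.time() if now is None else now)
        self.totals[key] += delta
        q = self.buckets[key]
        if q and q[-1][0] == sec:
            q[-1][1] += delta               # same second: merge into last bucket
        else:
            q.append([sec, delta])
        # Drop buckets that fall outside the largest supported window.
        while q and q[0][0] <= sec - self.max_window_s:
            q.popleft()
        return self.totals[key]

    def count_since(self, key, window_s, now=None):
        """Sum of deltas recorded for key in the last window_s seconds."""
        sec = int(time.time() if now is None else now)
        cutoff = sec - window_s
        return sum(d for s, d in self.buckets.get(key, ()) if s > cutoff)
```

For example, `c.count_since("page_views", 60)` would give the past minute and `c.count_since("page_views", 3600)` the past hour. The open question is how to keep this kind of state accurate when it is spread across many nodes.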

I am thinking about how to handle and scale up to millions of requests to this "system", for both reads and writes. I have strict requirements for low latency and data accuracy. Fault tolerance is also important, which is why I'm considering DHT algorithms.
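For context, this is roughly how I understand key placement in Kademlia: nodes and keys share a 160-bit ID space, and a key is stored on the nodes whose IDs are closest to it under the XOR metric. The sketch below is a simplification under assumptions: the helper names `node_id` and `closest_nodes` are made up, every node is assumed to know the full node list, and the real protocol's k-buckets and iterative lookup are left out.

```python
import hashlib


def node_id(name: str) -> int:
    """Map a name to a 160-bit integer ID using SHA-1 (hypothetical helper)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def closest_nodes(key: str, nodes, k: int = 3):
    """Return the k node names whose IDs are nearest to the key by XOR distance.

    Assumes full knowledge of the node list; a real Kademlia node only
    knows a subset and discovers the rest through iterative lookups.
    """
    target = node_id(key)
    return sorted(nodes, key=lambda n: node_id(n) ^ target)[:k]
```

With something like `closest_nodes("counter:page_views", ["dc-a-1", "dc-b-1", "dc-c-1", "dc-a-2"], k=2)`, every node computes the same two owners for a counter without a central directory. My concern is the extra lookup hops and the replication between those owners, given that latency and accuracy are my top priorities.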

If I were to rank my priorities, they would be:

  1. Latency
  2. Accuracy
  3. Fault tolerance

(Ignoring scalability here due to the nature of distributed systems)

Are there better approaches to this problem? Or would a DHT be the best way to go for a distributed system like this?

Edit: In relation to the CAP theorem, I would say it would be a CA system.

Baiqing
  • Is it globally distributed (does speed of light latency matter)? Would the system be a source of truth or merely a copy of counts counted somewhere else (would it have to implement a distributed increment operation)? Can you restate your requirements in terms of the CAP theorem? – the8472 Aug 19 '23 at 12:51
  • Yes, the service is globally distributed. I don't need accuracy down to speed-of-light latency; keeping accuracy to within a few seconds is good enough. All services need to be able to handle read and write operations, and they can be deployed in any location to better serve edge use cases. – Baiqing Aug 19 '23 at 19:59
  • How does this "counting" action work, exactly? Who initiates it, and how are you passing messages around between nodes? What made you jump to using a DHT as a solution? Have you considered more conventional approaches (Hadoop/MapReduce)? Are you sure a traditional (i.e. non-distributed) system can't scale to handle this? Is it "millions of requests" per second, hour, or day? Even if it's per hour, a single VM should be able to handle millions of increment ops. – Dai Aug 19 '23 at 20:06
  • Let's say there are datacenters A, B and C. Any service deployed in those datacenters should be able to initiate any action, such as creating a counter, incrementing it, or decrementing it. Messages are passed on an as-needed basis, e.g. if a given counter only needs to be synced between datacenters A and B, datacenter C would not have that number. Millions of requests per second is the likely situation. – Baiqing Aug 20 '23 at 22:09
