From Wikipedia:
Kademlia routing tables consist of a list for each bit of the node ID. If a node ID consists of 128 bits, a node will keep 128 such lists.

Given that the keyspace runs from 0 to 2^160, at most 2^160 nodes can be present in it, and each node ID is 160 bits long. If k=20, the maximum number of entries a node can keep in its routing table is 160x20. How can a node keep track of such a huge number of nodes in its routing table? Shouldn't a node keep entries only for the 20 nodes in its own k-bucket, with bucket size k=20? How can it keep 160 such lists when it is not itself on those lists, apart from the one list of 20 nodes it belongs to?

I'm using "list" and "bucket" interchangeably; they mean the same thing.

defalt
  • [This video](http://engineering.bittorrent.com/2013/01/22/bittorrent-tech-talks-dht/) about the BitTorrent mlDHT may help to understand how the Kademlia DHT works. – Encombe Mar 14 '17 at 20:48

1 Answer

The routing table size, measured in buckets, is asymptotically bounded by O(log₂(n/k)), where n is the actual number of nodes in the network (not the theoretical limit of 2^160) and k is the bucket size. Larger buckets therefore slightly reduce the number of buckets in the routing table.
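To make the bound more tangible, here is a rough simulation sketch. It is not a derivation from the Kademlia paper and makes several assumptions: node IDs are uniformly random, the table uses the flat one-list-per-bit layout from the question, and a bucket counts as "full" when at least k random nodes fall into its distance range. The names `bucket_index` and `count_full_buckets` are made up for this illustration.

```python
import random
from collections import Counter

ID_BITS = 160  # ID length used in the question


def bucket_index(own_id: int, other_id: int) -> int:
    """Bucket for other_id: position of the highest bit in the XOR distance."""
    return (own_id ^ other_id).bit_length() - 1


def count_full_buckets(n_nodes: int, k: int, seed: int = 0) -> int:
    """Count buckets that receive at least k candidates out of n random nodes."""
    rng = random.Random(seed)
    own_id = rng.getrandbits(ID_BITS)
    candidates = Counter()
    for _ in range(n_nodes):
        other = rng.getrandbits(ID_BITS)
        if other != own_id:  # a node never stores itself
            candidates[bucket_index(own_id, other)] += 1
    return sum(1 for count in candidates.values() if count >= k)


# With 8192 nodes and k=8, log2(8192 / 8) is exactly 10,
# so the printed count should land in that neighbourhood.
print(count_full_buckets(8192, 8))
```

Each step closer to the node's own ID halves the share of the keyspace a bucket covers, so only the first log₂(n/k) or so buckets can expect k or more candidates; the remaining ones stay sparse or empty no matter how long the IDs are.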

In practice, for the BitTorrent IPv4 DHT with k=8 and ~7M reachable nodes, you get routing table depths of around 19-22 buckets.
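As a quick sanity check, plugging these figures into log₂(n/k) gives a value at the low end of that range. This is only the leading term with no constant factor, so treat it as a ballpark; the function name `estimated_bucket_count` is made up for this sketch.

```python
import math


def estimated_bucket_count(n_nodes: int, k: int) -> float:
    """Leading-term estimate log2(n / k) for the number of buckets."""
    return math.log2(n_nodes / k)


# BitTorrent IPv4 DHT figures quoted above: ~7M nodes, k=8.
print(round(estimated_bucket_count(7_000_000, 8), 1))  # 19.7

# Theoretical extreme from the question: n = 2^160, k = 20.
print(round(estimated_bucket_count(2**160, 20), 1))  # 155.7
```

The result is not an integer, and it does not need to be: the expression describes how the table grows with n, not an exact bucket count.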

And even the purely theoretical 160*20 wouldn't be as bad as you think. That's just 3200 IP addresses plus a little associated state to keep in memory and send a packet to every now and then. Pacing the pings to one per second means you could still refresh the whole routing table in under an hour.
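The arithmetic behind that estimate, spelled out as a small sketch (the constant names are made up for illustration):

```python
ID_BITS = 160          # one bucket per bit of the node ID
K = 20                 # bucket size from the question
PINGS_PER_SECOND = 1   # pacing assumed above

max_entries = ID_BITS * K
refresh_minutes = max_entries / PINGS_PER_SECOND / 60

print(max_entries, round(refresh_minutes, 1))  # 3200 53.3
```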

the8472
  • Kademlia contacts only `O(log₂(n))` nodes during the search out of a total of **n** nodes in the system, *from [Wikipedia](https://en.wikipedia.org/wiki/Kademlia)*. Why did you write `O(log₂(n/k))`? If I put `n=2^160` and `k=20` then the result doesn't give an integer. Are you sure your asymptotic notation is correct? – defalt Mar 15 '17 at 14:27
  • "*during the search*" - you asked about the routing table, not lookups. *"If I put n=2^160 and k=20 then the result doesn't give an integer."* - please read up on what Big-O notation actually means. And as I already said, putting 2^160 in there makes no sense. There aren't even that many atoms in the planet. – the8472 Mar 15 '17 at 15:22
  • I get it, `n/k` represents buckets. It would still be correct with only **n**, but then it would represent the number of nodes there can be in each node's routing table. You express it as the number of buckets there can be in one's routing table. One way or another they mean the same thing, just represented differently. – defalt Mar 15 '17 at 15:57
  • I agree with that part; the estimated number of atoms on this planet is 1.33*10^50. Using `2^160` for the worst case is a bit overkill. – defalt Mar 15 '17 at 16:05
  • I need a citation for *the size of routing table is asymptotically bounded by **`O(log₂(n/k))`***. Do you have any reference where it is given or from where you came to know about it? Anything where `O(log₂(n/k))` is mentioned. I was not able to find its reference. – defalt Apr 24 '17 at 16:03
  • Nope, no reference because people usually don't ask for the number of buckets but for the total number of nodes in the routing table. But it's one of those "as can be trivially shown" things. – the8472 Apr 24 '17 at 19:24