Is it possible to compute an approximate size estimate of a kademlia network from a node's k-buckets?

Question

Assuming that nodeids are evenly distributed, would it be possible to calculate an estimated number of nodes based on the k-bucket cache?

The reason I want this is that I want to create a Kademlia network based on the mainline DHT with BEP42 added (https://www.bittorrent.org/beps/bep_0042.html) that stores data with some level of trust that a trustworthy node is actually providing it, and not a malicious actor who has an interest in altering the value for a given infohash key.

I want to use the estimated number of nodes to determine how much I can trust the answer a node gives me. So if a node gets a reply from a peer, then by using the distance between the peer's node ID and the requested infohash, together with the size of the network, I would calculate a trust score.
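As a rough sketch of that idea (not part of BEP42 or any existing client): if node IDs are uniformly distributed, the XOR distance from a peer to the infohash tells you how many nodes you would expect to sit even closer, namely `network_size * distance / 2**160`. The `trust_score` formula below is purely a hypothetical choice for illustration.

```python
ID_BITS = 160  # mainline DHT node IDs and infohashes are 160 bits


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia XOR distance between two equal-length IDs."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def expected_closer_nodes(peer_id: bytes, target: bytes, network_size: int) -> float:
    """Expected number of nodes closer to `target` than `peer_id`.

    Assumes node IDs are uniformly distributed over the ID space.
    """
    distance = xor_distance(peer_id, target)
    return network_size * distance / 2**ID_BITS


def trust_score(peer_id: bytes, target: bytes, network_size: int) -> float:
    """Hypothetical score in (0, 1]: 1.0 when no closer node is expected."""
    return 1.0 / (1.0 + expected_closer_nodes(peer_id, target, network_size))
```

A peer whose ID differs from the infohash only in the low bits scores close to 1, while a peer half the ID space away scores close to 0 in any network of reasonable size.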

I'm assuming I could multiply the size of the k-buckets in each layer to get an estimate. For example, in the following diagram, https://docs.google.com/presentation/d/11qGZlPWu6vEAhA7p3qsQaQtWH7KofEC9dMeBFZ1gYeA/edit#slide=id.g1718cc2bc_01994

the total estimate would be (going bottom up): (3+2)\*(4+1)\*(4+1)\*(4+1) = 625

score 1 · Accepted Answer · answered Oct 13 '22 at 21:23

Yes, it's possible to get an estimate this way. But it can be a fairly rough estimate. A better approach is to use the k-closest-node set of random find_node lookups which you'll have to do as part of routing table maintenance anyway. They provide more samples to calculate those estimates.
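One way to turn lookup results into a number, as a sketch under the uniform-ID assumption (this is one possible estimator, not necessarily what any particular client does): with N uniformly distributed nodes, the i-th closest node to a random target is expected at distance about `i * 2**160 / (N + 1)`. Fitting a line through the origin to the sorted distances gives that spacing, and the median over several lookups smooths out the noise. The function names and the simulation are made up for illustration.

```python
import heapq
import random
import statistics

ID_BITS = 160
ID_SPACE = 2**ID_BITS


def estimate_from_lookup(target: int, closest_ids: list[int]) -> float:
    """Estimate network size from the closest-node set of one lookup."""
    distances = sorted(node_id ^ target for node_id in closest_ids)
    # Least-squares slope through the origin of distance vs. rank:
    # slope = sum(i * d_i) / sum(i^2), expected to be ID_SPACE / (N + 1).
    sum_rank_sq = sum(i * i for i in range(1, len(distances) + 1))
    sum_rank_dist = sum(i * d for i, d in enumerate(distances, start=1))
    if sum_rank_dist == 0:
        raise ValueError("need at least one node at non-zero distance")
    return ID_SPACE * sum_rank_sq / sum_rank_dist - 1


def estimate_network_size(lookups: list[tuple[int, list[int]]]) -> float:
    """Median of per-lookup estimates; each item is (target, closest_ids)."""
    return statistics.median(estimate_from_lookup(t, ids) for t, ids in lookups)


def simulate(n_nodes: int, n_lookups: int = 50, k: int = 8, seed: int = 1) -> float:
    """Toy simulation: random IDs, perfect lookups returning the true k closest."""
    rng = random.Random(seed)
    nodes = [rng.getrandbits(ID_BITS) for _ in range(n_nodes)]
    lookups = []
    for _ in range(n_lookups):
        target = rng.getrandbits(ID_BITS)
        closest = heapq.nsmallest(k, nodes, key=lambda node_id: node_id ^ target)
        lookups.append((target, closest))
    return estimate_network_size(lookups)


if __name__ == "__main__":
    print(simulate(2000))
```

In a real network the closest set returned by a lookup can miss nodes or include stale ones, so the result should be treated as an order-of-magnitude figure.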

If you're designing a new DHT from scratch rather than using the bittorrent DHT then any node-density based security measure is pretty weak since someone could always buy some capacity on a botnet to attack a specific key region or something like that. You should consider whether it's possible to use cryptography to secure whatever you want to secure.

the8472

The DHT network that I'm trying to design is for a URL-associated comment system, like the old Gab Dissenter plugin, except that it will be decentralized. By hiring a botnet to attack, I suppose you are talking about a DoS attack against a particular set of nodes. Once the attackers leave, those nodes should come up again, so I don't think that's a problem. The big problem is that user metadata is also added to the network, and I want to make sure that a user can't control his own metadata. Hope that makes sense, I don't have a lot of characters to explain in this response. – redfish64 Oct 15 '22 at 06:02
You might want to explain your actual goal (rather than what you think the solution is) in a question. And no, I wasn't talking about a DoS attack on the target node, but about farming out to so many IP addresses that one can find some sitting at the right prefix. If the DHT has 1M nodes and you have 1M IP addresses to choose from, then you can find several that have the prefix you want. – the8472 Oct 15 '22 at 19:29
Point taken. I'm not worried about this kind of attack, because the stakes aren't that high. The user metadata simply contains what amounts to a karma score. It's OK if a few users can affect their own score, as long as most can't. The only reason I have a karma score at all is to incentivize people to run their own daemon. If they do so, they will get a boost in their score (in general people will connect through a browser extension using WebRTC and WebSockets, but some daemons need to exist for the network to work). Your comment makes it sound like you may have an alternative solution? – redfish64 Oct 16 '22 at 03:35
I don't think this really fits in a comment, but roughly it could be done by node identity, where each value must be signed by an originator and you can check with the originator whether they really published it. And to prevent collusion attacks one could have a trust network. Trust in decentralized environments is difficult; the solution needs to be tailored to what's available. Basing it on IP addresses is kind of crude. It may or may not be enough, depending on what resources attackers have and how big the network is. Smaller networks are easier to attack. – the8472 Oct 17 '22 at 15:41
Yes, I was planning on nodes signing messages. A decentralized trust network seems like a tricky thing and easy to abuse. The main problem can be boiled down to this: consider nodes A, B, and C. A did a bad thing. B has a cryptographic proof of this. Where can C look in the p2p network to find the proof of what A did without prior knowledge of B? This is what my idea was trying to solve (i.e. take A's pubkey, append e.g. 'd1', then hash it and store the proof under the resulting key; also do this with 'd2', 'd3', etc. to make it even more difficult for A to attack). – redfish64 Oct 18 '22 at 09:16
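
The key derivation described in that last comment could look roughly like the sketch below. SHA-1 is an assumption here, chosen only because it yields 160-bit keys like mainline DHT infohashes; the function name and the default of three replicas are hypothetical.

```python
import hashlib


def proof_keys(pubkey: bytes, replicas: int = 3) -> list[bytes]:
    """Derive the DHT keys under which proofs about `pubkey` would be stored.

    Key i is SHA-1(pubkey + b"d<i>"), for i = 1..replicas.
    """
    return [
        hashlib.sha1(pubkey + f"d{i}".encode("ascii")).digest()
        for i in range(1, replicas + 1)
    ]
```

Anyone who knows A's public key can recompute the same keys and look them up, and since the hashes land in unrelated regions of the ID space, A would need nodes close to every one of them to suppress the proof.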
