Is this data sharing problem an NP problem?

Question

Here is my problem: there are n peers in a P2P network that all request the same data block, under the following constraints:
1. Each peer has its own upload bandwidth, and the average bandwidth equals the size of the data block.
2. The peers have different deadlines for this data block. If a peer does not get the entire block before its deadline, it has to ask the server for help.
3. A peer can transfer data (partial or entire) only once it has the entire data block.

The objective is to minimize the server's total upload. I can't figure out whether there is an optimal algorithm or whether it is an NP problem. Deadline-first or largest-bandwidth-first may not handle some situations. Is there a known NP problem similar to this? It looks like a graph flow problem or instruction scheduling, but I find it difficult because I have to deal with the deadlines and the growth of the suppliers' total bandwidth at the same time. I hope I can get some directions or resources about a solution :) Thanks.

Jérôme Verstrynge · Answer 1 · 2011-04-06T15:59:36.313

Considering that each peer acts individually in your case, it is not a single automaton solving your problem, but many. Since fetching a data block when it is not available within a given delay is typically a polynomial problem, and since the job is accomplished by individual peers, your problem is not an NP problem for each peer locally.

On the other hand, if a server has to compute the minimal allocation of backup resources to transfer 'missing blocks', you would first have to find out the probability that a peer misses a block (average + standard deviation, for example). Assuming you know the statistical distribution of such events, you could compute the total bandwidth you would need to transfer those missing blocks with a chosen risk of failure/tolerance in the bandwidth. If you are using multiple servers to cover the need, make sure your peers contact them randomly to distribute the load.
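A minimal sketch of that calculation, assuming each peer misses its deadline independently with a known probability and that every miss costs the server one full block. The function name `server_upload_budget`, the independence assumption, and the safety factor `k` are illustrative and do not come from the question.

```python
import math

def server_upload_budget(miss_probs, block_size=1.0, k=2.0):
    """Mean plus k standard deviations of the server's total upload.

    Assumption: peer i misses its deadline independently with
    probability miss_probs[i], and each miss costs one full block.
    """
    # Expected number of misses times the block size.
    mean = block_size * sum(miss_probs)
    # Variance of a sum of independent Bernoulli variables.
    variance = block_size ** 2 * sum(p * (1.0 - p) for p in miss_probs)
    return mean + k * math.sqrt(variance)

if __name__ == "__main__":
    # Four peers, each with a 50% chance of missing its deadline.
    print(server_upload_budget([0.5, 0.5, 0.5, 0.5]))
```

A larger `k` means a lower risk of under-provisioning at the cost of more reserved capacity. If misses are correlated, as they may well be in a real swarm, the variance term above is too optimistic.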

Solving this statistical problem is not an NP issue. You can collect failure info from each peer and aggregate it on a central/server peer. Therefore, my conclusion is that your issue is not an NP problem.

PART II:

Oh, I understand your case better now: multiple 'server' peers can potentially help one peer get a full block. In this case, the number of server peers grows exponentially in your system for a given block. To me, this optimization problem has all the characteristics of a flooding problem, and those are NP.

Even if your graph of peers and the potential connections between them were static (which is never the case in a real P2P network), computing the optimal solution in a reasonable amount of time for more than 50 or 100 nodes is virtually impossible, unless you can make very specific assumptions about this graph (which is almost never possible in general and not always useful).

But do you absolutely need to have the absolute optimal solution or is something near the optimal good enough?

Heuristics will tell you that if your peers have more or less the same download bandwidth capacity, then it makes sense to serve peers with the highest UPLOAD bandwidth first to maximize the avalanche effect and to reduce the risk for a peer having to ask for help, in general.
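To make the heuristics concrete, here is a rough sketch under a deliberately simplified model, all of which is assumed rather than taken from the question: peers are served one at a time using the pooled upload bandwidth of everyone who already holds the block, download capacity is unlimited, one seeder with bandwidth `seed_bw` holds the block at time 0, and a peer that cannot finish before its deadline gets the block from the server immediately and then joins the pool. The function names are hypothetical.

```python
from itertools import permutations

def server_upload(order, block=1.0, seed_bw=1.0):
    """Total data the server uploads when peers are served in `order`.

    order: sequence of (upload_bw, deadline) tuples.
    """
    pool, t, server = seed_bw, 0.0, 0.0
    for upload_bw, deadline in order:
        finish = t + block / pool      # whole pool serves this one peer
        if finish <= deadline:
            t = finish                 # served by the swarm in time
        else:
            server += block            # too late: the server sends the block
        pool += upload_bw              # either way the peer becomes a supplier
    return server

def bandwidth_first(peers):
    """Serve the peers with the highest upload bandwidth first."""
    return sorted(peers, key=lambda p: -p[0])

def deadline_first(peers):
    """Serve the peers with the earliest deadline first."""
    return sorted(peers, key=lambda p: p[1])

def best_order(peers):
    """Exhaustive search over all orders; only feasible for a few peers."""
    return min(permutations(peers), key=server_upload)

if __name__ == "__main__":
    # (upload_bw, deadline) pairs, hand-picked for illustration.
    peers = [(0.1, 1.0), (10.0, 2.5), (0.1, 2.2), (0.1, 2.3)]
    for name, order in [("bandwidth first", bandwidth_first(peers)),
                        ("deadline first", deadline_first(peers)),
                        ("exhaustive", best_order(peers))]:
        print(name, server_upload(order))
```

On this hand-picked instance both heuristics leave at least one peer to the server, while the exhaustive search finds an order that needs no server help. The search visits n! orders, so it is only a baseline for checking heuristics on very small inputs.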

If your graph is relatively balanced (that is, most peers can connect to most peers), then I bet the minimum bandwidth of the initial servers will be a logarithmic function of the number of nodes in your graph times the average speed at which peers expect to be served. This is only my gut feeling and should be validated with real measurements or a solid model of your case.
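One way to see where a logarithm could come from, as a rough sketch only: assume a complete graph, no deadlines, and that every holder uploads exactly one full block per round (the question's "average bandwidth is the size of the data block", idealized to every peer). The number of holders then doubles each round. The name `rounds_to_cover` is hypothetical, and the function counts rounds rather than server bandwidth, so it illustrates the avalanche effect without validating the estimate above.

```python
def rounds_to_cover(n_peers, seeds=1):
    """Rounds until n_peers extra peers hold the block under ideal doubling.

    Assumption: every holder delivers one full block to one new peer
    per round, so the number of holders doubles each round.
    """
    holders = seeds
    target = seeds + n_peers
    rounds = 0
    while holders < target:
        holders *= 2
        rounds += 1
    return rounds

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, rounds_to_cover(n))
```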

Thanks for your response :)! Mine is the second case you mentioned. But I don't understand what you mean by the probability of missing a block. Assume the P2P network topology is a complete graph and that peers become sources after receiving the entire block. Even if I had that information, how would I know that this transmission method minimizes the total bandwidth the server has to offer? - changefor, Apr 06 '11 at 13:13
