
Suppose that edge weights in a graph G are uniformly distributed over [0, 1). Which algorithm, Prim's or Kruskal's, will be faster? I think it will be Kruskal's, since we can take advantage of a particular sorting algorithm; sorting is the bottleneck step in Kruskal's algorithm.

sww
  • I don't see how a uniform distribution will help you with the sorting; you still might have distinct values. – G. Bach Jun 09 '13 at 16:34
  • Also you may want to refer here http://stackoverflow.com/questions/1195872/kruskal-vs-prim. The preference really depends on whether the graph is dense as others have pointed out. – Santhosh Jun 09 '13 at 17:54
  • @G.Bach it helps because you can use radix sort to sort in linear time with overwhelmingly high probability of success. See my answer. – xdavidliu Dec 26 '20 at 03:12
  • If you are the one generating the numbers, you can generate them in sorted order rather than sorting. – Dave Dec 27 '20 at 00:50
  • @Dave how do you generate random numbers in sorted order? – G. Bach Dec 27 '20 at 18:28
  • @G.Bach https://stackoverflow.com/questions/46058304/random-numbers-external-sort/46061122#46061122 – Dave Dec 27 '20 at 22:55

3 Answers


This is something you have to benchmark. You can use fancy data structures (van Emde Boas trees) and sorting algorithms (some counting sort variant) to drop the theoretical expected complexity of both algorithms to something closer to linear. However, it's unclear that any such trick can improve the practical performance of either algorithm. Shenanigans for improving memory locality will probably make a bigger difference.
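As a rough illustration of what such a benchmark could look like, here is a minimal sketch in Python (my own addition, not from the original answer). It assumes networkx is available and uses its built-in Prim and Kruskal implementations on a random graph with uniform [0, 1) weights; the graph sizes are arbitrary.

```python
import random
import time

import networkx as nx  # assumption: networkx is installed

def random_weighted_graph(n, m):
    """n vertices, ~m edges, uniform [0, 1) weights; a chain keeps it connected."""
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for v in range(1, n):
        g.add_edge(v - 1, v, weight=random.random())
    while g.number_of_edges() < m:
        u, v = random.randrange(n), random.randrange(n)
        if u != v:
            g.add_edge(u, v, weight=random.random())
    return g

g = random_weighted_graph(5_000, 200_000)
for algo in ("prim", "kruskal"):
    t0 = time.perf_counter()
    nx.minimum_spanning_tree(g, algorithm=algo)
    print(algo, round(time.perf_counter() - t0, 3), "s")
```

Memory layout, constant factors, and the graph generator itself will dominate such measurements, so treat the numbers as indicative only.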

tmyklebu
  • see my answer. This is an exercise in CLRS, and the author specifically was asking for certain tricks. – xdavidliu Dec 26 '20 at 03:09
  • @xdavidliu Insofar as this is a pure theory exercise (for which cs.stackexchange.com would be a better forum), consider that Kruskal can be made to run in linear time if its input is a tree. Out of the box, Karger-Klein-Tarjan '95 gets you the right tree in expected linear time. But the random input admits an even easier algorithm: Deal with degree-1 and degree-2 vertices, then do min spanning forest on the subgraph with edges of value < |V|/|E|, of which there are |V|+O(sqrt(|V|)), then contract its components, of which there are – tmyklebu Dec 26 '20 at 20:17
  • If the input is a tree, the MST is just the tree itself, right? – xdavidliu Dec 26 '20 at 20:48
  • Yes, but you're required to run either Kruskal or Prim in this question. :) – tmyklebu Dec 26 '20 at 20:58

The distribution of edge weights does not matter.

The main difference between Prim's and Kruskal's is that Prim's algorithm (in its simple array-based form) runs in time proportional to the square of the number of vertices, while Kruskal's algorithm runs in time proportional to E log E because of the sort. Therefore, Prim's is faster on dense graphs and Kruskal's is faster on sparse graphs.

For example, if you have 1000 vertices and 3000 edges (sparse), Prim's will take about K1 * 1,000,000 steps while Kruskal's takes about K2 * 24,000 (using natural logarithms). But if you have 1000 vertices and 250,000 edges (dense), Kruskal's grows to about K2 * 3,100,000 while Prim's stays at K1 * 1,000,000.
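A quick back-of-the-envelope check of those figures (my own illustration; K1 and K2 are the unknown constant factors from the answer and are left out):

```python
import math

V = 1000
for E in (3_000, 250_000):          # the sparse and dense cases above
    prim_ops = V * V                # ~V^2 for the array-based Prim
    kruskal_ops = E * math.log(E)   # ~E ln E for sort-dominated Kruskal
    print(f"E={E}: Prim ~{prim_ops:,}, Kruskal ~{round(kruskal_ops):,}")
# E=3000:   Prim ~1,000,000, Kruskal ~24,019
# E=250000: Prim ~1,000,000, Kruskal ~3,107,304
```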

Tyler Durden
  • Using Fibonacci heaps (http://en.wikipedia.org/wiki/Prim's_algorithm#Time_complexity), Prim's runtime can be further reduced to E + V Log V. – Santhosh Jun 09 '13 at 17:56
  • This guy has one bronze badge; he is not about to start writing FH implementations, and besides, Kruskal's algorithm can always beat Prim's on a sufficiently sparse graph regardless of what data structure you use. – Tyler Durden Jun 09 '13 at 18:17
  • @TylerDurden Being a new user on SO does not imply being new to computer science and/or programming. Also, unless you mean it in a strictly informal sense, which definition of `dense` and `sparse` did you have in mind? I realize the classes of graphs for which either algorithm is faster can be derived from the complexities of the used variants of both algorithms, but it would be a nice addition to your answer to expressly specify those classes. – G. Bach Jun 09 '13 at 23:02
  • If it is not obvious to the OP whether his graph is sparse or dense, then it probably will not make a significant difference to the execution time which algorithm is chosen. – Tyler Durden Jun 09 '13 at 23:25
  • So basically what this answer is saying is that the fact that edge weights have a certain distribution is irrelevant? – Maria Ines Parnisari Dec 28 '16 at 20:02
  • This answer is wrong; the distribution of weights _does_ matter: this is a starred exercise in CLRS, so I doubt they threw out the weights as a red herring. Also, sparse / dense also does not matter. See my answer here. – xdavidliu Dec 26 '20 at 03:07
  • @TylerDurden the OP's question is an exercise from a chapter in CLRS which uses Fibonacci heaps and union-find w/ path compression extensively. However, that's moot because Kruskal with radix sort is O(E) here, as I show in my answer, so even Fib heaps are inferior – xdavidliu Dec 26 '20 at 03:17

Update 2: As @David Eisenstat points out in a comment below, a much simpler way to sort in O(E) time is to do a bucket sort with |E| buckets. The interval [0, 1) can be divided into |E| buckets of length 1/|E| each, numbered 0, 1, 2, ..., |E|-1, and any weight w in the interval belongs in bucket number k = floor(|E| w). The expected number of weights in each bucket is O(1), hence sorting can be done with |E| insertion sorts of size O(1) each, and this gives an O(E alpha(V)) expected-time Kruskal algorithm.

Note: as @G. Bach points out, the above assumes that comparison of weights and the floor(|E| w) floating-point multiplication can be performed in O(1) time, which may require a certain suspension of disbelief. For very large values of |E|, both of these operations may still have an O(lg E) contribution.
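As an illustration of that bucketing step, here is a minimal sketch (my own code, not from CLRS; it assumes the edges come as (u, v, w) tuples with w uniform in [0, 1), and Python's built-in sort stands in for the per-bucket insertion sort):

```python
import math

def bucket_sort_edges(edges):
    """Sort (u, v, w) edges by weight w in [0, 1) in expected O(E) time."""
    m = len(edges)
    buckets = [[] for _ in range(m)]
    for e in edges:
        # bucket k = floor(|E| * w); min() guards against w rounding up to m
        buckets[min(m - 1, math.floor(m * e[2]))].append(e)
    out = []
    for b in buckets:
        # each bucket holds O(1) edges in expectation, so even a quadratic
        # sort here costs O(1) expected time per bucket
        out.extend(sorted(b, key=lambda e: e[2]))
    return out
```

Feeding the result into the usual union-find Kruskal loop (sketched further down) gives the O(E alpha(V)) expected total.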

Update: As G. Bach pointed out below, the expected bin size after the first round of radix sort with O(1) digits is still Omega(E), so the answer below is technically not guaranteed to sort in O(E) time. However, it may be possible to choose the number of digits to be something smaller than O(lg E), perhaps O(lg lg E) or O(sqrt(lg E)), so that sorting takes less than O(E lg E) expected time.

Original answer

This is exercise 23.2-6 in CLRS. I'm pretty sure that Kruskal would be faster (and thus the other answers here are wrong). The distribution of weights does matter; it is the density / sparsity of the graph that does not matter.

The ordinary version is dominated by the O(E lg E) time from sorting the edge weights. When the edge weights are drawn from a uniform distribution, we can perform a radix sort on some constant number of digits, then fix collisions in contiguous subarrays with further rounds of radix sort. That's O(E) time.

Then the rest is just ordinary Kruskal: using disjoint sets with union by rank and path compression (as in chapter 23 of CLRS), the remaining work is O(E alpha(V)), where alpha(V) is the inverse Ackermann function and is <= 4 for any sane value of V (think numbers unimaginably larger than the number of atoms in the universe).

Hence, with radix sort, Kruskal is linear O(E) with probability arbitrarily close to 1.
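For concreteness, here is what that disjoint-set bookkeeping looks like once the edges are sorted (my own sketch of the standard structure, not code from CLRS):

```python
class DisjointSet:
    """Union-find with union by rank and path compression."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:   # union by rank
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

def kruskal(num_vertices, sorted_edges):
    """Main Kruskal loop: O(E alpha(V)) given edges already sorted by weight."""
    ds = DisjointSet(num_vertices)
    return [(u, v, w) for u, v, w in sorted_edges if ds.union(u, v)]
```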

Note on radix sort:

The expected number of collisions (i.e., edge weights whose first 15 digits are the same) can be made arbitrarily small by using more digits, but it is only O(1) if the number of digits is on the order of lg E. Of course, that many digits would imply an O(E lg E) radix sort, which would defeat the purpose. However, we don't actually need to avoid collisions entirely, only limit their size so that they can be fixed in linear time.

Hence we may sort some constant number of digits (say 15) in one "round" of radix sorting, which separates the array of weights into contiguous subarrays (a.k.a. "bins") that all share the same first 15 digits; a second round then sorts digits 16-30 within each subarray, and so on.

A formal proof would involve calculations similar to the birthday paradox, but since the probability of collisions decreases exponentially with the number of additional digits used, it should be possible to completely sort the above using O(1) rounds, which would result in O(E) total sorting time.
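A minimal sketch of that round-by-round idea (my own illustration; it works on binary digits rather than decimal ones, and the number of bits per round and the recursion cutoff are arbitrary choices):

```python
def radix_sort_uniform(weights, bits=8, shift=0, max_depth=64):
    """MSD radix sort of floats in [0, 1): one round sorts on the next `bits`
    binary digits; bins that still collide get a further round, and tiny or
    exhausted bins fall back to a comparison sort."""
    if len(weights) <= 1 or shift >= max_depth:
        return sorted(weights)
    buckets = [[] for _ in range(1 << bits)]  # counting-sort style pass
    for w in weights:
        # the next `bits` binary digits of w after the first `shift` digits
        digit = int(w * (1 << (shift + bits))) & ((1 << bits) - 1)
        buckets[digit].append(w)
    out = []
    for b in buckets:
        out.extend(radix_sort_uniform(b, bits, shift + bits, max_depth)
                   if len(b) > 1 else b)
    return out
```

As the comments below discuss, with a constant number of bits per round the bins only shrink geometrically, so this is not guaranteed to be O(E); it is the |E|-bucket variant from Update 2 above that gets the expected linear bound.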

xdavidliu
  • +1 This answer is pretty dense. The radix sort for a fixed number of digits is O(E), but it has a (because of uniform dist.) tiny chance of leaving some weights in the wrong order because they have the same prefix of digits that are used for radix sorting ("radix digits"). If I understood correctly, you say to solve that by selection sorting the weights that have the same radix digits before actually doing the radix sort; how do you do that in O(E)? Also, I don't think the expected number of radix-sort collisions is always O(1), shouldn't that depend on E and the number of radix digits? – G. Bach Dec 26 '20 at 13:14
  • Thanks, that cleared some of it up. Thinking about it some more, the average "bin size" will be E/(2^(number of digits used for radix sorting)) regardless of input, which is still Ω(E) for any constant number of digits used for radix sorting, and at some point you'll have to either comparison-sort or account for word length for radix sort. I don't think radix sort ultimately helps the time complexity, but has a good chance of improving performance anyway for the reasons you point out - a uniform distribution is almost guaranteed to give us evenly-distributed weights, which helps radix sort. – G. Bach Dec 26 '20 at 14:29
  • @G.Bach That's an excellent point about the Omega(E) expected bin size for O(1) digits. So using O(lg E) digits would certainly get expected bin size down to O(1), but then radix sort itself would take O(E lg E) and defeat the purpose. However, I wonder if it would be possible to use something in between, maybe O(lg lg E) digits or something, so that expected running time to sort is still little o(E lg E)? – xdavidliu Dec 26 '20 at 14:54
  • This problem now reminds me of perfect static hashing (the FKS algorithm) where the idea is basically to sample hash functions until the total number of collisions is linear (i.e. there are at most sqrt(E) many elements with the same hash for any given hash), then hash each bucket again into space that's quadratic in the bucket size; but since each bucket is at most sqrt(E) large, that still works out to linear total effort in an expected O(1) many hash functions attempted to use for it. Maybe here this might translate to (lg E) ^ (1/2) being the "right" number of radix bits? – G. Bach Dec 26 '20 at 15:10
  • That would leave us with expected bin sizes of Theta(sqrt(E)) by radix sorting in O(E (lg E)^(1/2)) time, and each of the bins could then be sorted in O(sqrt(E) lg sqrt(E)) = O(sqrt(E) lg E) time, but since we still have sqrt(E) bins we end up at O(sqrt(E) * sqrt(E) lg E) = O(E lg E) again. Hm. – G. Bach Dec 26 '20 at 15:13
  • Use |E| bins and insertion sort on each bin (or any sort that compares each pair of elements at most once). The probability that two edges end up in the same bin is 1/|E|, so the expected number of comparisons is at most (|E| choose 2) (1/|E|) = O(|E|). – David Eisenstat Dec 26 '20 at 18:21
  • @DavidEisenstat How can we distribute to |E| bins in less than lg E time per element? With radix sort we'd have to look at a prefix of lg E size to determine the bin unless I'm missing something. – G. Bach Dec 26 '20 at 20:10
  • @G.Bach I think we can take the interval [0,1) and divide it into E intervals of length 1/E, indexed as 0, 1, 2, 3 ... |E|-1. Weight w falls into the interval with index = floor(E w). Are you concerned that it would take at least O(lg E) time to perform this multiplication? If yes, the multiplication can probably be sped up with an FFT, or even done approximately instead of exactly. – xdavidliu Dec 26 '20 at 20:51
  • @xdavidliu By now I think the machine model is too foggy for me to figure out what to do with this question; do comparisons/arithmetic operations still take O(1) if there are unboundedly many digits in a number? That doesn't seem appropriate, especially if on the other hand we take the word length into account when analyzing the radix sort complexity; we wouldn't be measuring with the same yardstick. Should we bound the number of digits to some function of E? Not sure. – G. Bach Dec 26 '20 at 21:11
  • Those are all very valid points. I'm pretty sure the bucket sort solution is what the authors of CLRS intended here, but if it really does require unrealistic assumptions like O(1) floating-point arithmetic and comparisons of arbitrary precision, then this may warrant an entry in their errata. However, the statement of the problem is pretty open-ended, and they never really claim that O(E) Kruskal is possible, so CLRS still reserves the right of plausible deniability.