Efficiently construct a minimum spanning tree such that a given subset of vertices in G are leaves + proof

Question

I am trying to design an algorithm that, given a connected weighted graph G = (V, E) and a subset of vertices U ⊆ V, constructs a minimum spanning tree in which every vertex in U is a leaf (other vertices may also be leaves), or reports that no such tree exists (returns False).

This is all I have so far, an adaptation of Prim's algorithm (fair warning: it's really bad; I don't know whether it works or is efficient, or what data structures to use, and I will accept literally any other correct algorithm instead):

Let x be an arbitrary node in G
Set S = {x}
While S != V:
    Let (u,v) be the cheapest edge with u in S and v not in S
    Add (u,v) to tree T if u is not in U, add v to S

If all u in U is in the tree T:
    return T
Else:
    return False

I also have a picture of what I think it would do to this graph I drew: pic here

A proof that the algorithm is correct would also give me some peace of mind.

Minor comment: `Let (u,v) be the cheapest edge with u in S and v not in S` should exclude vertices `u in G` that you have already used, otherwise they wouldn't become leaves. — Vincent van der Weele, Oct 15 '20 at 06:58
@VincentvanderWeele oh yes, thank you. Other than this, do you think this adaptation would work? — iheartcoding124, Oct 15 '20 at 12:52
I _think_ both your approach and the answer below work. The correctness of the other approach seems easier to prove, though. — Vincent van der Weele, Oct 15 '20 at 19:33
@VincentvanderWeele actually quick comment about your minor comment (lol): why should I exclude vertices u in G that I have already used? I already have a line where if an edge (u,v) has its node u that is in U, the edge wouldn't be added to the tree anyways. If this node that is in U is v instead, this edge would be okay to add to the tree, thus making it a leaf node. — iheartcoding124, Oct 16 '20 at 00:02
Ah, you have another if in the second line, making sure that `u` and `v` cannot be both in `U`. I missed that. — Vincent van der Weele, Oct 16 '20 at 08:20

גלעד ברקן · Accepted Answer · 2020-10-15T15:51:50.087

If all vertices u ∈ U are to be leaves in a solution, no u can be used in that solution to connect two other vertices. All vertices not in U must be connected by edges not incident to any u.

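Remove U and all edges incident to U. Find the minimum spanning tree of the remaining graph, then connect each u to that tree by the smallest-weight removed edge whose other endpoint is not in U. If the remaining graph is disconnected, or some u has no edge to a vertex outside U, no such tree exists.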
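The two steps above can be sketched in Python as follows. This is only a minimal illustration, not a tuned implementation: the function name `constrained_mst`, the `(u, v, w)` edge-tuple format, and the choice of Prim's algorithm with `heapq` for the inner MST are all assumptions made for the example, and it assumes at least one vertex lies outside U.

```python
import heapq
from itertools import count

def constrained_mst(vertices, edges, U):
    """Return tree edges as (a, b, w) tuples, or None if no valid tree exists.

    vertices: iterable of hashable labels; edges: iterable of (a, b, w);
    U: vertices that must be leaves. All names here are illustrative.
    """
    U = set(U)
    core = [v for v in vertices if v not in U]
    if not core:
        return None  # assumption of this sketch: some vertex lies outside U

    adj = {v: [] for v in core}   # graph with U and its edges removed
    attach = {}                   # cheapest edge from each u to a non-U vertex
    for a, b, w in edges:
        if a not in U and b not in U:
            adj[a].append((b, w))
            adj[b].append((a, w))
        elif (a in U) != (b in U):
            u, v = (a, b) if a in U else (b, a)
            if u not in attach or w < attach[u][2]:
                attach[u] = (u, v, w)
        # edges with both endpoints in U can never appear in the tree

    # Prim's algorithm on the remaining graph; the counter breaks weight ties
    # so vertex labels never need to be compared.
    tie = count()
    seen = {core[0]}
    heap = [(w, next(tie), core[0], b) for b, w in adj[core[0]]]
    heapq.heapify(heap)
    tree = []
    while heap and len(seen) < len(core):
        w, _, a, b = heapq.heappop(heap)
        if b in seen:
            continue
        seen.add(b)
        tree.append((a, b, w))
        for c, wc in adj[b]:
            if c not in seen:
                heapq.heappush(heap, (wc, next(tie), b, c))
    if len(seen) < len(core):
        return None  # remaining graph is disconnected

    # Attach every u as a leaf via its cheapest edge into the remaining graph.
    for u in U:
        if u not in attach:
            return None  # u has no edge to a vertex outside U
        tree.append(attach[u])
    return tree
```

For example, `constrained_mst("ABCD", [("A","B",1), ("B","C",2), ("A","C",5), ("D","A",4), ("D","B",3), ("D","C",10)], {"D"})` builds the tree on A, B, C first and then hangs D off B via the weight-3 edge.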

edited Oct 15 '20 at 15:51

answered Oct 15 '20 at 11:11


I don't think this would work, specifically the "removing all edges incident to u" part. If we do, we could get a disconnected graph, and wouldn't be able to find a minimum spanning tree of the nodes left in the graph. – iheartcoding124 Oct 15 '20 at 22:14
Also, I don't think this is particularly efficient, especially if you are storing the tree as a heap in a priority queue. – iheartcoding124 Oct 15 '20 at 22:15
@iheartcoding124 please provide a very simple example that has a valid solution, where this won't work. – גלעד ברקן Oct 15 '20 at 22:21
@iheartcoding124 as for efficiency, I'd say it seems *more* efficient since finding the minimum spanning tree on a smaller graph takes fewer resources. – גלעד ברקן Oct 15 '20 at 22:22
oof wait never mind, I was looking at my example that I took a pic of, and forgot to remove the edge incident to D that is connected to B, so I thought I had a disconnected graph for a second lol. – iheartcoding124 Oct 15 '20 at 22:28
As for runtime, I was thinking of storing the graph as an adjacency list, where each node is a key and its value is a list of [neighbor, edge weight] pairs (for example, if C had neighbors E and W, the entry for C in the dictionary would be {C: [[E, w(E)], [W, w(W)]]}). Then, for each node in U, look up the key and remove it, then store it and its value in another list/dict. After the MST is built, iterate through that list/dict, find the cheapest neighbor for each entry, find this neighbor in the heap, and set the neighbor's neighbor to be the entry. – iheartcoding124 Oct 15 '20 at 22:34
I think searching for a node in a heap is O(log(n)), and I guess adding the entry as its neighbor is just O(1), since it is already a leaf node so there wouldn't be any heapifying? – iheartcoding124 Oct 15 '20 at 22:36
@iheartcoding124 I believe searching in a simple heap would have higher worst-case complexity than O(log n). I know of an indexed heap, which I think can have O(1) search for the indexed element. I can't speak to the last step of adding the entry without better understanding the solution. I'm not familiar with MST implementations, generally. – גלעד ברקן Oct 16 '20 at 01:35

Some observations on why this is correct. 1) Removing a leaf from a tree leaves a tree. So if there exists an MST with all of `U` as leaves, there also exists an MST on `V \ U`. 2) For each `u in U`, the edge connecting it to the MST of `V \ U` is the lowest-weight edge from `u` to `V \ U`. This is true because the tree can contain no edge between two vertices in `U` (otherwise not all of `U` would be leaves), and the weight cannot be higher, by contradiction (switching the edge would give a tree with lower total weight). 3) From the previous two points, the spanning tree of `V \ U` must be an MST as well, by contradiction. – Vincent van der Weele Oct 16 '20 at 08:17

@iheartcoding124 (you may be interested in Vincent van der Weele's comment just above. I thought I'd mention it since they didn't tag you.) – גלעד ברקן Oct 16 '20 at 10:15
@גלעדברקן thanks! I ran out of my 600 characters :) – Vincent van der Weele Oct 16 '20 at 11:04
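
The observations in the comments above can be condensed into one cost decomposition. This is only a sketch, assuming |V| ≥ 3 and at least one vertex outside U; the symbols T', p(u), and G[V \ U] are notation introduced here, not taken from the thread. For any spanning tree T in which every u ∈ U is a leaf, let T' = T − U and let p(u) ∈ V \ U be the single neighbor of u in T:

```latex
\begin{align*}
w(T) &= w(T') + \sum_{u \in U} w\bigl(u, p(u)\bigr), \\
w(T') &\ge w\bigl(\mathrm{MST}(G[V \setminus U])\bigr), \\
w\bigl(u, p(u)\bigr) &\ge \min_{\substack{v \in V \setminus U \\ (u,v) \in E}} w(u, v)
  \quad \text{for each } u \in U.
\end{align*}
```

Because T' is a spanning tree of G[V \ U] and the choices of p(u) are independent of each other and of T', both lower bounds can be met at the same time, which is what the construction in the answer does.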
