
I have this homework assignment:

  1. Read edge list from a file
  2. Turn it into an adjacency list
  3. Output an unweighted, undirected spanning tree of the graph (we can assume starting point is at vertex 0)

I'm having trouble getting step 3 to output correctly. For file 1, the expected output is [[1],[0,2,3],[1],[1]] and I got [[1,2],[0,3],[0],[1]], which is acceptable since both are spanning trees of the n=4 graph in file 1.

The major problem is file 2, and I don't know what's wrong in my code. For file 2 I get: [[10], [], [10], [10], [], [], [], [], [], [], [0, 3, 2], [], []]

The data for both files is at the end of my code. (Edit: the problems start at tree=[]; everything before that works.)

Here's my attempt at the problem:

import itertools

edge_i=[]
edge_j=[]
x = []
y = []
edgelist = []
n = int(input("Enter value for n:")) #input n for number of vertices
adjlist = [[] for i in range(n)] #create n sublists inside empty initial adjlist
data = [['0','1'],['2','1'],['0','2'],['1','3']]


for line in data: #for loop for appending into adjacency list the correct indices taken from out of the edgelist
    #(this line won't be needed when hardcoding input) line = line.replace("\n","").split(" ")
    for values in line:
        values_as_int = int(values)
        edgelist.append(values_as_int)



#set of vertices present in this file - pick out only n vertices
verticesset=set(edgelist)
listofusefulvertices=list(verticesset)[0:n]


P = list(itertools.permutations(listofusefulvertices,2))


x.append(edgelist[0::2])
y.append(edgelist[1::2])
x = sum(x,[])
y = sum(y,[])
dataint=zip(x,y)
datatuples=list(dataint)
outdata = set(datatuples)&set(P)
output=list(outdata)


for k in range(len(output)):
    edge_i.append(output[k][0])
    edge_i.append(output[k][1])
    edge_j.append(output[k][1])
    edge_j.append(output[k][0])

for i in range(len(edge_i)):
    u = edge_i[i]
    v = edge_j[i]
    adjlist[u].append(v)
print(adjlist)


tree = []
for vertexNum in range(len(listofusefulvertices)):
    tree.append([])
treeVertices = [0]*n
treeVertices[0]=1
for vertex in range(0,n): #(here the range in skeletal code from school used 1,n but it only worked for me when I used 0,n-1 or 0,n)
    if treeVertices[vertex] == 1:
        for adjVertex in adjlist[vertex]:
            if treeVertices[adjVertex] == 0:
                treeVertices[adjVertex]=1
                tree[adjVertex].append(vertex)
                tree[vertex].append(adjVertex)

print(tree)


#The data from files were: file 1: [['0','1'],['2','1'],['0','2'],['1','3']]
# file 2: [['10','2'],['7','4'],['11','3'],['1','12'],['6','8'],['10','3'],['4','9'],['5','7'],['8','12'],['2','11'],['1','6'],['0','10'],['7','2'],['12','5']]
Kenshin
  • Please explain your notation. At first, I thought that the input was a list of graph edges, giving the endpoints of each edge. However, your output obviously isn't listing edges in that format. – Prune Mar 27 '17 at 23:28
  • Your initial thought on input is correct, Input are edges with endpoints of each edge, output are supposed to be an adjacency list with index number of the list being the vertex number and the elements of the index being the adjacent points the vertex is connected to. – Kenshin Mar 27 '17 at 23:45
  • Could you please hard-code the input data into your example (get rid of the input statements)? I've tried to reverse-engineer your input files; I've also tried hard-coding the input. I've spent enough time on that. I now have code that runs on supposed "file1" input, but I do not get the output you indicate, just [[], [3], [], [1]] – Prune Mar 28 '17 at 00:10
  • Sure, gimme a sec – Kenshin Mar 28 '17 at 00:12
  • Done, for file 2, just copy and replace the data from comments at bottom of code. I get [[1, 2], [0, 3], [0], [1]] when I copy this code with the hardcoded file1 into pythontutor's python 3.6 visualization. – Kenshin Mar 28 '17 at 00:19
  • Delete this line when hardcoding input as well: `line = line.replace("\n","").split(" ")` – Kenshin Mar 28 '17 at 00:26
  • Thanks; I've reproduced your problem now. I also removed the input for **n**, instead deriving it as the largest node number + 1. Now, to look for the problem ... – Prune Mar 28 '17 at 00:28

2 Answers


I didn't wade through all your code; you really should look at the guidance on creating a Minimal, Complete, and Verifiable Example.

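However, it is fairly simple to turn an edge list into a graph and then run a standard spanning-tree search. On an unweighted graph every spanning tree is minimal, so Prim's algorithm reduces to a plain traversal such as the stack-based one below: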

def create_graph(edgelist):
    graph = {}
    for e1, e2 in edgelist:
        graph.setdefault(e1, []).append(e2)
        graph.setdefault(e2, []).append(e1)
    return graph

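# Prim's-style search; with no weights this is a stack-based (depth-first) traversal
def mst(start, graph):
    closed = set()        # vertices already in the tree
    edges = []            # tree edges as (parent, child)
    q = [(start, start)]  # dummy self-edge seeds the search
    while q:
        v1, v2 = q.pop()
        if v2 in closed:
            continue
        closed.add(v2)
        edges.append((v1, v2))
        for v in graph[v2]:
            if v not in closed:  # skip neighbours that are already in the tree
                q.append((v2, v))
    del edges[0]  # drop the dummy (start, start) edge
    assert len(edges) == len(graph)-1  # fails if the graph is not connected
    return edges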

>>> graph = create_graph([[10, 2], [7, 4], [11, 3], [1, 12], [6, 8], [10, 3], [4, 9], [5, 7], [8, 12], [2, 11], [1, 6], [0, 10], [7, 2], [12, 5]])
>>> min_graph = create_graph(mst(0, graph))
>>> min_graph
{0: [10],
 1: [6],
 2: [11, 7],
 3: [10, 11],
 4: [7, 9],
 5: [7, 12],
 6: [8, 1],
 7: [2, 5, 4],
 8: [12, 6],
 9: [4],
 10: [0, 3],
 11: [3, 2],
 12: [5, 8]}
>>> [sorted(min_graph[k]) for k in sorted(min_graph)]
[[10], [6], [7, 11], [10, 11], [7, 9], [7, 12], [1, 8], [2, 4, 5], [6, 12], [4], [0, 3], [2, 3], [5, 8]]

A graph can have several valid MSTs. For example, your smaller edge list produces [[2], [2, 3], [0, 1], [1]], which is also a valid MST but differs from your expected output.
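Since several different outputs can all be correct, it can help to check a result mechanically rather than compare it with one expected answer. The following is a minimal sketch, not part of the original answer: `is_spanning_tree` is a hypothetical helper, and it assumes both the graph and the candidate tree are adjacency lists indexed by vertex number, as in the question.

```python
def is_spanning_tree(tree, graph):
    """Return True if `tree` (an adjacency list) is a spanning tree of `graph`."""
    n = len(graph)
    if len(tree) != n:
        return False
    # Every tree edge must exist in the graph and be recorded in both directions.
    for u, neighbours in enumerate(tree):
        for v in neighbours:
            if v not in graph[u] or u not in tree[v]:
                return False
    # A tree on n vertices has exactly n - 1 edges (each is stored twice).
    if sum(len(neighbours) for neighbours in tree) != 2 * (n - 1):
        return False
    # All vertices must be reachable from vertex 0 using tree edges only.
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in tree[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


graph = [[1, 2], [0, 2, 3], [0, 1], [1]]  # file 1 as an adjacency list

print(is_spanning_tree([[1], [0, 2, 3], [1], [1]], graph))    # expected output in the question: True
print(is_spanning_tree([[1, 2], [0, 3], [0], [1]], graph))    # the asker's output: True
print(is_spanning_tree([[2], [2, 3], [0, 1], [1]], graph))    # this answer's output: True
print(is_spanning_tree([[1, 2], [0, 2], [0, 1], []], graph))  # a cycle that misses vertex 3: False
```

All three outputs discussed for file 1 pass the check, which is why none of them is more "correct" than the others.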

AChampion
  • Thanks for editing and giving this example. The code needs to handle any input data in this format, though, not just one specific list of edges. – Kenshin Mar 27 '17 at 23:46
  • Sorry about including code from steps 1 and 2 as well, those have no issues. – Kenshin Mar 27 '17 at 23:51
  • The problem is to find a minimal (?) spanning tree, not merely to convert the input. – Prune Mar 28 '17 at 00:00
  • The graph is unweighted, so any correct spanning tree will do. But if done correctly (without using other modules like networkx), I would expect it to output one specific spanning tree determined by the iteration order of the loop. So there's probably no "random factor" in selecting the next vertex: just follow the order of the loop and check whether the next vertex is already connected. I've edited my question, by the way; sorry for not clarifying this earlier: (edit: starting from tree=[] is where the problems lie, the rest have no issues) – Kenshin Mar 28 '17 at 00:03
  • This still doesn't do it -- but it's closer. Note that the output reports connectivity in *traversal* order, losing the original node numbering. Node 0 connects to 10; 10 connects to [0, 3]; 3 connects to [10, 11]; 11 connects to [2, 3]; etc. – Prune Mar 28 '17 at 01:22
  • I noticed I missed a `sorted(min_graph)`, fixed. – AChampion Mar 28 '17 at 01:23

The problem is in your main processing loop at the end. You use node 0 as the starting node, but then assume that your connectivity runs in numerical order. You flag all the nodes adjacent to node 0 (only node 10), and then take up node 1 next. That's not yet connected, so you skip it ... but you never come back.

Here's the code and trace from my low-tech debugging run:

for vertex in range(0,n): #(here the range in skeletal code from school used 1,n but it only worked for me when I used 0,n-1 or 0,n)
    print ("Working on vertex", vertex, treeVertices[vertex] == 1)
    if treeVertices[vertex] == 1:
        for adjVertex in adjlist[vertex]:
            print ("  Adjacent vertex", adjVertex, treeVertices[adjVertex] == 0)
            if treeVertices[adjVertex] == 0:
                treeVertices[adjVertex]=1
                tree[adjVertex].append(vertex)
                tree[vertex].append(adjVertex)

print("Spanning tree", tree)

Output:

Adjacency list [[10], [12, 6], [11, 7, 10], [11, 10], [9, 7], [7, 12], [8, 1], [5, 4, 2], [6, 12], [4], [0, 3, 2], [2, 3], [1, 8, 5]]

Working on vertex 0 True
  Adjacent vertex 10 True
Working on vertex 1 False
Working on vertex 2 False
Working on vertex 3 False
Working on vertex 4 False
Working on vertex 5 False
Working on vertex 6 False
Working on vertex 7 False
Working on vertex 8 False
Working on vertex 9 False
Working on vertex 10 True
  Adjacent vertex 0 False
  Adjacent vertex 3 True
  Adjacent vertex 2 True
Working on vertex 11 False
Working on vertex 12 False
Spanning tree [[10], [], [10], [10], [], [], [], [], [], [], [0, 3, 2], [], []]

See the problem? The algorithm assumes that the flow of finding the spanning tree will always succeed if you move from available nodes to higher-numbered nodes. Since this tree requires several "down" moves, you don't get them all. You start at 0, mark 10, and then skip nodes 1-9. When you get to 10, you add nodes 2 and 3 ... but you never go back to expand on them, and that's all you get.

To get them all, do one of two things:

  1. Put the "unexpanded" nodes on a "to do" list, starting with [0]. Your top for loop changes to a while that continues until that list is empty.
  2. Keep your current structure, but add an outer loop that continues until either all nodes are tagged, or no edges get added (no spanning tree found).
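Both options can be sketched as follows. This is an illustrative sketch under stated assumptions, not the asker's final code: the function names `spanning_tree_worklist` and `spanning_tree_repeated_sweeps` are made up for this example, and the adjacency list is the one for file 2 copied from the trace above.

```python
from collections import deque


def spanning_tree_worklist(adjlist, start=0):
    """Option 1: expand vertices from a to-do list instead of in numeric order."""
    n = len(adjlist)
    tree = [[] for _ in range(n)]
    in_tree = [False] * n
    in_tree[start] = True
    todo = deque([start])  # vertices that are in the tree but not yet expanded
    while todo:
        vertex = todo.popleft()
        for adj in adjlist[vertex]:
            if not in_tree[adj]:
                in_tree[adj] = True
                tree[vertex].append(adj)
                tree[adj].append(vertex)
                todo.append(adj)  # come back to this vertex later
    return tree


def spanning_tree_repeated_sweeps(adjlist, start=0):
    """Option 2: keep the original for loop, but repeat it until nothing changes."""
    n = len(adjlist)
    tree = [[] for _ in range(n)]
    in_tree = [False] * n
    in_tree[start] = True
    added = True
    # Stop when every vertex is tagged, or when a full sweep adds no edge
    # (in which case the graph is not connected and has no spanning tree).
    while added and not all(in_tree):
        added = False
        for vertex in range(n):
            if in_tree[vertex]:
                for adj in adjlist[vertex]:
                    if not in_tree[adj]:
                        in_tree[adj] = True
                        tree[vertex].append(adj)
                        tree[adj].append(vertex)
                        added = True
    return tree


# Adjacency list for file 2, as printed in the debugging run.
adjlist = [[10], [12, 6], [11, 7, 10], [11, 10], [9, 7], [7, 12], [8, 1],
           [5, 4, 2], [6, 12], [4], [0, 3, 2], [2, 3], [1, 8, 5]]

for build in (spanning_tree_worklist, spanning_tree_repeated_sweeps):
    tree = build(adjlist)
    edge_count = sum(len(neighbours) for neighbours in tree) // 2
    print(build.__name__, edge_count, tree)
```

Option 1 visits each vertex once, so it is usually the better choice; option 2 may sweep the vertex range several times but stays closest to the original loop. The two can return different trees for the same graph, and both should contain n - 1 edges when the graph is connected.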

Does that get you moving to a solution?

Prune
  • Yes, thanks ! I'm on a train to class now, will go change my code once I get home in about 6 hours. – Kenshin Mar 28 '17 at 01:22