I've built a tree image (see my other question).
I now have some major groups.
One group has green and brown nodes, with 'B' or 'A' in the name. The second group has only pink nodes, with 'T' in the name, and the last group has yellow, orange and blue nodes, with the letters 'L', 'X' and 'H'. The colors are the node colors, and the letters are part of the node names. I want to color the edges of each group differently.

#taken from draw_graphviz
def get_label_mapping(G, selection): 
    for node in G.nodes(): 
        if (selection is None) or (node in selection): 
            try: 
                label = str(node) 
                if label not in (None, node.__class__.__name__): 
                    yield (node, label) 
            except (LookupError, AttributeError, ValueError): 
                pass


labels = dict(get_label_mapping(G, None))
for label in labels.keys():
    if str(label) != "Clade":
        num = label.name.split('-')
        if 'T' in num[0]:
            node_colors.append('#CC6699')
        elif 'X' in num[0]:
            node_colors.append('r')
        else:
            node_colors.append('y')

So I wrote a function similar to the one above, changing node to edge (get_edge), and tried this:

for edge in edges.keys():
    if str(edge) != "Clade":
        if 'T' in edge:
            edge_colors.append('b')

where edge is:

(Clade(branch_length=-0.00193, name='T-7199-8'), Clade(branch_length=0.00494))

Perhaps there's a way to say: if 'T' is in the name, then color the edge. What do you think?

Does anyone know how to do this?

Thank you

psoares
  • When you say "if a T is in the name, then color the edge", do you mean the name of the *edge*, or the name of a node (or both nodes) incident on the edge? Also, can you please show the full for-loop? It will be easier to help you if we can see what exactly you're iterating over. – gotgenes Dec 06 '10 at 17:05
  • T is in the name of the node, so I thought of doing something like: if T is in the node name, color the edge. The loop was in the other question, but I'm going to update my question and provide a bit more code :) – psoares Dec 06 '10 at 17:16
  • With node_colors I'm changing the colors of the nodes. What I would like is, for instance, for all nodes that have a red color to have a blue edge, or something similar. I hope I've made myself clear enough – psoares Dec 06 '10 at 17:21
  • Thanks for clarifying the question of "name". Could you now explicitly state the conditions on which you color an edge? For example, do you color an edge blue so long as *either* one of its nodes' names start with "T", or only if *both* do? What happens if one node's name starts with "T" and another's starts with "X": do you color it blue, red, or neither? Be specific. Try to think of exceptional cases. – gotgenes Dec 06 '10 at 18:32
  • I updated my question with a short explanation of the colors and the names. The colors represent different clusters, so they aren't supposed to be mixed – psoares Dec 06 '10 at 18:45
  • @pavid: I've updated my answer to have a shot at this situation (assuming that BioPython Clades have a `.name` attribute). Have a look at it. – Thomas K Dec 06 '10 at 18:56
  • @pavid You still did not give the conditions under which you decide to color an edge, and what color you want it colored. Please try to answer the questions in my previous comment, starting at "For example...", and write any other specifications you think we should know. – gotgenes Dec 06 '10 at 21:55

1 Answer

I'm guessing (since I don't know how that snippet fits in to the rest of the code) that you're iterating over the nodes, and adding one colour to the list for each node. Like the error message suggests, you need to work out the colour required for each edge. That's going to be trickier.
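A rough sketch of working out one colour per edge from the node names, under stated assumptions: each edge is a pair of node objects, leaves carry a `.name` such as 'T-7199-8', and internal clades have no name (as the example edge in the question suggests). `Node`, `colour_for_edge`, `PREFIX_COLOURS` and the 'X-1-2' name are made up for illustration and are not part of BioPython or networkx. This only colours edges that touch a named node, so it reaches the terminal branches and nothing deeper in the tree.

```python
from collections import namedtuple

# Hypothetical stand-in for a BioPython Clade: only .name matters here.
Node = namedtuple("Node", ["name", "branch_length"])

# Illustrative mapping from a letter in the name prefix to an edge colour.
PREFIX_COLOURS = {"T": "b", "X": "r"}
DEFAULT_COLOUR = "k"


def colour_for_edge(edge, prefix_colours=PREFIX_COLOURS, default=DEFAULT_COLOUR):
    """Return the colour matching the first named endpoint, else the default."""
    for node in edge:
        name = getattr(node, "name", None)
        if not name:
            continue  # internal clades have no name
        prefix = name.split("-")[0]
        for letter, colour in prefix_colours.items():
            if letter in prefix:
                return colour
    return default


edges = [
    (Node("T-7199-8", -0.00193), Node(None, 0.00494)),
    (Node(None, 0.00494), Node("X-1-2", 0.001)),   # invented name
    (Node(None, 0.00494), Node(None, 0.01118)),
]
# One colour per edge, in the same order as the edge list
edge_colours = [colour_for_edge(edge) for edge in edges]
print(edge_colours)  # ['b', 'r', 'k']
```

With a real graph, the pairs would come from `G.edges()`, and the resulting list would be passed to the drawing call as `edge_color` in that same order.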

Alright, got it! The code could be tidied up a bit, but this works.

# Define your centre node: you need to pull this out of the graph. Call it b.
# The number changes each time: look for a Clade(branch_length=0.03297)
# Its neighbors have branch lengths .00177, .01972, .00774.
# list() keeps this working on networkx versions where nodes() returns a view.
b = list(G.nodes())[112]

# Recursively paint edges below a certain point, ignoring ones we've already seen
def paintedges(graph, startnode, colour):
    for node in graph.neighbors(startnode):
        if node not in alreadyseen:  # alreadyseen is in global scope
            graph[startnode][node]["colour"] = colour
            alreadyseen.add(node)
            paintedges(graph, node, colour)

alreadyseen = set([b])
# Paint each of the three subtrees hanging off b in its own colour
for neighbour, colour in zip(list(G.neighbors(b)), ["red", "blue", "green"]):
    G[b][neighbour]["colour"] = colour
    paintedges(G, neighbour, colour)

# Now make a list of all the colours, in the order networkx keeps the edges
edgecolours = [G[f][t]["colour"] for f, t in G.edges()]
# kwargs is the keyword-argument dict passed to the drawing call
kwargs["edge_color"] = edgecolours

[Image: Tree with colours]
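The same painting idea can be written without recursion or a global set. This is a minimal sketch on a toy tree, assuming the graph is available as a plain adjacency dict (with networkx it could be built as `{n: list(G.neighbors(n)) for n in G}`) and that it really is a tree. `paint_subtrees` and the node labels are hypothetical.

```python
def paint_subtrees(adjacency, centre, colours):
    """Give every edge the colour of the subtree below `centre` it belongs to.

    Returns a dict mapping frozenset({u, v}) -> colour.
    """
    edge_colour = {}
    seen = {centre}
    for first, colour in zip(adjacency[centre], colours):
        edge_colour[frozenset((centre, first))] = colour
        seen.add(first)
        stack = [first]
        while stack:  # explicit stack instead of recursion
            current = stack.pop()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    edge_colour[frozenset((current, neighbour))] = colour
                    stack.append(neighbour)
    return edge_colour


# Toy tree: centre "b" with three subtrees
adjacency = {
    "b": ["n1", "n2", "n3"],
    "n1": ["b", "T-1", "T-2"],
    "n2": ["b", "X-1"],
    "n3": ["b", "L-1", "H-1"],
    "T-1": ["n1"], "T-2": ["n1"], "X-1": ["n2"],
    "L-1": ["n3"], "H-1": ["n3"],
}
painted = paint_subtrees(adjacency, "b", ["red", "blue", "green"])

# Colours listed in edge order, which is what the drawing call expects
edges = [("b", "n1"), ("n1", "T-1"), ("n2", "X-1"), ("n3", "H-1")]
edge_colours = [painted[frozenset(edge)] for edge in edges]
print(edge_colours)  # ['red', 'red', 'blue', 'green']
```

Keying the result on `frozenset` means the lookup works whichever way round an edge is reported.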

Thomas K
  • That's right, I'm iterating over the nodes. But I'm not really sure how to get the edges for the groups I want – psoares Dec 06 '10 at 14:04
  • From your previous code, BioPython is generating the edges for you. Unless it has the ability to colour the edges when generating the networkx graph, you'll need to get the edges yourself: `for fromnode, tonode in thegraph.edges()`. See the networkx documentation: http://networkx.lanl.gov/tutorial/tutorial.html#edge-attributes – Thomas K Dec 06 '10 at 14:15
  • OK, so I have to do something like `for edge in G.edges()`, but then how can I say that the edge between nodes 1 and 2 should be red? I have yellow, magenta, green, blue and brown nodes, and I would like to say, for instance, color_edge = red if node_color = blue and green – psoares Dec 06 '10 at 14:38
  • @pavid: I've added a rough example. I haven't tested it, and you may need to adapt it to your purposes. – Thomas K Dec 06 '10 at 14:49
  • Instead of iterating through every edge, checking to see if you should color it, and which color you should color it as, it may be much more efficient to induce a subgraph with `Graph.subgraph(nbunch)`, then iterate through *those* edges, setting the color attribute to the color you want for that group. – gotgenes Dec 06 '10 at 14:59
  • @gotgenes: That's a much neater solution. Assuming it works, you should post it as a new answer. – Thomas K Dec 06 '10 at 15:15
  • thank you both, I'm still trying to work around your example @Thomas because it doesn't work right now – psoares Dec 06 '10 at 15:17
  • @pavid: I'm not entirely surprised. I could help you debug it, but I suspect you'd be better off using gotgenes' solution, anyway. – Thomas K Dec 06 '10 at 15:42
  • @Thomas K, Actually, I re-read the question, and he can iterate directly over the edges that need coloring, which would be most efficient. I asked some questions in the comments above which, when answered, should shed more light on what's going awry. – gotgenes Dec 06 '10 at 17:08
  • @gotgenes: It may be more efficient, but I suspect that using subgraph is simpler, bearing in mind that the graph we're talking about is not huge. – Thomas K Dec 06 '10 at 18:52
  • so I tried your code, it gives me this error: 'float' object does not support item assignment. fromnode is Clade(branch_length=-0.00111, name='T-ASC498/03-5'), tonode is Clade(branch_length=0.01118) – psoares Dec 06 '10 at 19:05
  • Hmmm, weird. One moment, I'll try running it myself – Thomas K Dec 06 '10 at 19:19
  • The files I provided in the other question don't have these names, since that was something I implemented afterwards. But if you need anything, just let me know – psoares Dec 06 '10 at 19:27
  • @pavid: OK, I hit a different error, and I've updated it again. It runs, but doesn't work. Looking at your earlier question, I realise that the parameters of the graph itself don't affect drawing: you have to pass arguments to the drawing command separately. Also, I think this approach won't do what you expect: it will only colour the terminal branches going to each leaf of your tree. That'll need a bit more thought. – Thomas K Dec 06 '10 at 19:38
  • Yeah, it doesn't work. I'm going to try other options and try to improve your code to see if it works – psoares Dec 06 '10 at 21:04
  • Also, in your example, you say an example edge is `(Clade(branch_length=-0.00193, name='T-7199-8'), Clade(branch_length=0.00494))`. Is there a reason why the second `Clade` instance has no `name`? Can you guarantee each `Clade` instance will be given a `name`, or not? – gotgenes Dec 06 '10 at 21:59
  • @gotgenes: I would guess only the terminal nodes have names, probably corresponding to the genes they represent. – Thomas K Dec 06 '10 at 22:47
  • @pavid: It's working at last. You can tidy the code up a bit, but there it is. – Thomas K Dec 06 '10 at 23:21
  • Thank you both for your help, and sorry I stopped answering your questions, but I didn't have internet at home – psoares Dec 07 '10 at 10:38
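The subgraph suggestion from the comments can be sketched in plain Python too. An induced subgraph keeps exactly the edges whose two endpoints are both in the chosen node set, so the sketch below colours those edges per group. It assumes each group lists all of its nodes, including the unnamed internal clades. `colour_by_group`, the groups and the node labels are made up for illustration.

```python
def colour_by_group(edges, groups, default="k"):
    """Colour each edge by the group that contains both of its endpoints.

    `groups` maps a colour to a set of nodes; edges that fall inside no
    group (for example edges joining two clusters) get `default`.
    """
    colours = []
    for u, v in edges:
        for colour, members in groups.items():
            if u in members and v in members:
                colours.append(colour)
                break
        else:
            colours.append(default)
    return colours


edges = [("root", "n1"), ("n1", "T-1"), ("n1", "T-2"), ("root", "n2"), ("n2", "X-1")]
groups = {
    "b": {"n1", "T-1", "T-2"},
    "r": {"n2", "X-1"},
}
edge_colours = colour_by_group(edges, groups)
print(edge_colours)  # ['k', 'b', 'b', 'k', 'r']
```

With networkx, `G.subgraph(members).edges()` should yield the same edges for a single group.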