How can I save a dendrogram generated by scipy into Newick format?
You need the linkage matrix Z, which is the input to scipy's dendrogram function, and you need to convert it to Newick format. You also need a list, leaf_names, holding the names of your leaves, indexed by leaf id. Here is a function that will do the job:
def get_newick(node, parent_dist, leaf_names, newick='') -> str:
    """
    Convert scipy.cluster.hierarchy.to_tree()-output to Newick format.

    :param node: output of scipy.cluster.hierarchy.to_tree()
    :param parent_dist: output of scipy.cluster.hierarchy.to_tree().dist
    :param leaf_names: list of leaf names
    :param newick: leave empty, this variable is used in recursion.
    :returns: tree in Newick format
    """
    if node.is_leaf():
        # Leaf: emit "name:branch_length" followed by what has been built so far.
        return "%s:%.2f%s" % (leaf_names[node.id], parent_dist - node.dist, newick)
    else:
        if len(newick) > 0:
            # Internal node: close the group and append its branch length.
            newick = "):%.2f%s" % (parent_dist - node.dist, newick)
        else:
            # Root node: close the group and terminate the tree.
            newick = ");"
        newick = get_newick(node.get_left(), node.dist, leaf_names, newick=newick)
        newick = get_newick(node.get_right(), node.dist, leaf_names, newick=",%s" % (newick))
        newick = "(%s" % (newick)
        return newick
from scipy.cluster import hierarchy

tree = hierarchy.to_tree(Z, False)
newick = get_newick(tree, tree.dist, leaf_names)
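To actually save the result, the string only has to be written to a text file. Below is a minimal end-to-end sketch, not a definitive implementation: the toy data, the leaf names, the output path, and the helper names `subtree_to_newick`, `linkage_to_newick` and `save_newick` are all made up for illustration. It uses a variant of the conversion that returns each subtree's string instead of threading an accumulator through the recursion, so the left child is written before the right one; the encoded tree is the same.

```python
import os
import tempfile

import numpy as np
from scipy.cluster import hierarchy


def subtree_to_newick(node, leaf_names):
    """Return the Newick text for the subtree rooted at `node` (no branch length, no ';')."""
    if node.is_leaf():
        return leaf_names[node.id]
    children = []
    for child in (node.get_left(), node.get_right()):
        # Branch length = merge height of this node minus merge height of the child.
        branch_length = node.dist - child.dist
        children.append("%s:%.2f" % (subtree_to_newick(child, leaf_names), branch_length))
    return "(%s)" % ",".join(children)


def linkage_to_newick(Z, leaf_names):
    """Convert a scipy linkage matrix to a Newick string."""
    tree = hierarchy.to_tree(Z, False)
    return subtree_to_newick(tree, leaf_names) + ";"


def save_newick(Z, leaf_names, path):
    """Write the Newick representation of Z to `path` and return the string."""
    newick = linkage_to_newick(Z, leaf_names)
    with open(path, "w") as handle:
        handle.write(newick + "\n")
    return newick


# Toy data (assumption): three 1-D points, so the merge heights are easy to check.
points = np.array([[0.0], [1.0], [5.0]])
leaf_names = ["a", "b", "c"]
Z = hierarchy.linkage(points, method="single")

# Hypothetical output location; replace with wherever you want the file.
out_path = os.path.join(tempfile.gettempdir(), "tree.nwk")
newick = save_newick(Z, leaf_names, out_path)
print(newick)
```

The resulting file can be opened by any tool that reads Newick. Note that very deep trees can hit Python's recursion limit with either version.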