
My tree-building tool only produces binary trees. To keep the tree binary, it often introduces very short branches.

This is annoying when I try to compare trees, since those short branches introduce splits that should not be there.

Is there an easy way, using ete3 (or some other library), to remove internal branches whose branch length is below a specified limit?

As an example, let the branch length from root to AB be smaller than the limit:

      /-A
   /-|
  |   \-B
--|
  |     /-C
   \---|
        \-D

Then the resulting tree should look like this:

   /-A
  |
  |--B
--|
  |     /-C
   \---|
        \-D

i tried it like this:

from ete3 import Tree

tree = "((A:0.1,B:0.2):0.005,(C:0.3,D:0.4):0.009);"


t1 = Tree(tree, quoted_node_names=True, format=1)


limit = 0.006

for node in t1.iter_descendants():
    if node.dist <= limit:
        nn = node._children
        nodelist = []
        for n in nn:
            nodelist.append(n.name)
        for n in nodelist:
            parent = node.up
            remove = t1.search_nodes(name=n)
            remove[0].delete()
            # parent._children.append(remove)



print(t1)

resulting in this tree:

        /-C
-- /---|
        \-D

So I manage to cut off the A and B leaves, but I fail to attach them to the parent node.

Is this a valid strategy to achieve this?

If not, how should I tackle this problem?

thank you very much in advance,

best,

t.

tristan

1 Answer


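Oh, it was way easier than expected:

# get_descendants() returns a list, so deleting nodes while looping is safe
for node in t1.get_descendants():
    # only collapse internal nodes; leaves keep their short branches
    if not node.is_leaf() and node.dist <= limit:
        # delete() removes the node and reattaches its children to its parent
        node.delete()

does it.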
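
For anyone wondering what the delete step is doing, here is a minimal, dependency-free sketch of the same idea: drop an internal node whose branch is too short and hand its children to its parent. The `Node` class and the helpers `build_example`, `collapse_short_branches` and `to_newick` are hypothetical names made up for this illustration; they are not ete3's API, and ete3's own implementation may differ in detail.

```python
class Node:
    """Hypothetical minimal tree node, for illustration only (not ete3's TreeNode)."""

    def __init__(self, name="", dist=0.0):
        self.name = name
        self.dist = dist      # length of the branch leading to this node
        self.children = []
        self.up = None        # parent node, None for the root

    def add_child(self, child):
        child.up = self
        self.children.append(child)
        return child

    def is_leaf(self):
        return not self.children

    def descendants(self):
        """Return a snapshot list of all nodes below this one (preorder)."""
        out = []
        for child in self.children:
            out.append(child)
            out.extend(child.descendants())
        return out


def collapse_short_branches(root, limit):
    """Remove internal nodes with dist <= limit, reattaching their children."""
    # Loop over a snapshot so that editing the tree does not disturb the loop.
    for node in root.descendants():
        if node.is_leaf() or node.dist > limit:
            continue
        parent = node.up
        # Find the node by identity, then splice its children into its place.
        idx = next(i for i, c in enumerate(parent.children) if c is node)
        for child in node.children:
            child.up = parent  # children keep their own branch lengths
        parent.children[idx:idx + 1] = node.children
        node.up = None
        node.children = []
    return root


def to_newick(node):
    """Topology-only Newick string (branch lengths omitted)."""
    if node.is_leaf():
        return node.name
    return "(" + ",".join(to_newick(c) for c in node.children) + ")"


def build_example():
    """Build ((A:0.1,B:0.2):0.005,(C:0.3,D:0.4):0.009) by hand."""
    root = Node()
    ab = root.add_child(Node(dist=0.005))
    ab.add_child(Node("A", 0.1))
    ab.add_child(Node("B", 0.2))
    cd = root.add_child(Node(dist=0.009))
    cd.add_child(Node("C", 0.3))
    cd.add_child(Node("D", 0.4))
    return root


tree = collapse_short_branches(build_example(), limit=0.006)
print(to_newick(tree) + ";")
```

Note that this sketch throws away the length of the collapsed branch. With ete3, I believe `delete()` accepts a `preserve_branch_length=True` argument if you would rather add that length to the children, but check the documentation for your version.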

tristan