First, I'm not convinced you actually have a problem. But let's assume you do.
> Can anyone suggest what else I could do to effectively delete the Tree object at the end of each iteration of the for loop?
You could try to figure out who's keeping a reference to it alive, and del that too. I notice that you missed Root; I'll bet that has a reference to either the Tree object, or most of its data.
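If you do want to hunt for whatever is holding the reference, the standard gc module can list an object's direct referrers. This is only a rough sketch: Node, cache, and find_holders are made-up names for illustration, not part of your code or of the Tree library.

```python
import gc

class Node:
    """Hypothetical stand-in for a tree node; not the real Tree class."""
    def __init__(self, name):
        self.name = name

def find_holders(obj):
    """Return the containers (lists, dicts, sets, tuples) that refer directly to obj."""
    return [r for r in gc.get_referrers(obj)
            if isinstance(r, (list, dict, set, tuple))]

root = Node("root")
cache = [root]  # an easy-to-forget extra reference

holders = find_holders(root)
print(any(h is cache for h in holders))
```

Expect some noise in the results: the namespace dict that holds the name root shows up as a referrer too, so you usually have to filter by hand.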
But the simple way to do it is to use scopes. Just refactor the loop body into a function, and all those variables created inside the loop become local variables inside the function, and they all go away when the function returns:

    def do_tree_stuff(i):
        Assignment_Tree = Tree()
        Root = Assignment_Tree.get_tree_root()
        # ...
        Root.add_feature("name", i)
        populate_tree() # this function extends the branches of the Tree and adds leaves
        for leaf in Assignment_Tree.iter_leaves():
            chain = []
            score = leaf.dist
            chain.append(leaf.name)
            for ancestor in leaf.get_ancestors():
                chain.append(ancestor.name)

    for i in i_iminus1_pool_dict.keys():
        do_tree_stuff(i)

As long as the function doesn't mutate any globals or closure cells, it can't possibly leave anything behind in its caller's locals. So you don't need to try to figure out what locals might have gotten modified and del them; you know none of them got modified, and you don't have to do anything.
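If you want to convince yourself that the locals really do go away, a weak reference makes a handy probe. The sketch below assumes CPython and uses a made-up Node class with parent links (my guess at roughly how a tree node looks, not the real Tree class). Because parent and child refer to each other, they form a reference cycle, so it's the cycle collector rather than plain reference counting that reclaims them.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for a tree node with a parent link."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def build_and_use_tree(i):
    root = Node(i)
    Node("leaf", parent=root)   # root <-> leaf forms a reference cycle
    return weakref.ref(root)    # a weak reference does not keep root alive

probe = build_and_use_tree(0)
gc.collect()                    # force a collection instead of waiting for the next automatic one
print(probe() is None)
```

In a real program you wouldn't normally call gc.collect() yourself; it's here only so the check doesn't depend on when the collector next runs.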
And if you want to refactor the inner loop into another function, go for it.
If you're retaining data that you shouldn't be, then that actually is a problem, and you need to fix it. That would happen if something in that loop mutates something that lives outside the loop, which then holds a reference to a leaf, which has a reference to the root, which has a reference to the whole tree. But I can't see anything in your posted code that could be doing that.
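For illustration, here is roughly what that kind of accidental retention could look like. Everything in it is hypothetical (Node, seen_leaves, do_tree_stuff_leaky); it is not taken from your code, and it assumes CPython.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for a tree node with a parent link."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

seen_leaves = []  # lives outside the function, so it outlives every call

def do_tree_stuff_leaky(i):
    root = Node(i)
    leaf = Node("leaf", parent=root)
    seen_leaves.append(leaf)     # leaf.parent is root, so the whole tree stays reachable
    return weakref.ref(root)

probe = do_tree_stuff_leaky(0)
gc.collect()
still_alive = probe() is not None     # the outer list still reaches root

seen_leaves.clear()                   # drop the stray reference
gc.collect()
freed_after_clear = probe() is None
print(still_alive, freed_after_clear)
```

If you only need the names, appending leaf.name instead of leaf would avoid pinning the tree in the first place.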
But meanwhile, this still won't actually release memory to the OS. Once Python has allocated memory, it generally keeps it, but it will reuse it. If the first tree is garbage when you create the second tree, the second tree can go into the same memory the first one occupied. This is generally a much better thing to do than calling malloc and free all over the place, and even in the rare cases when it isn't, you can't stop Python from doing it.
If you really do need to allocate and free memory repeatedly, you can always take that function you refactored and spin it off into a child process, using multiprocessing. When a process goes away, all of its memory goes away. But most likely, that will just add overhead for no benefit.
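If you do go that route, one possible shape is a pool that replaces its worker after every task. This is a sketch, not a recommendation: the body of do_tree_stuff is a placeholder workload, and run_each_in_fresh_process is a name I made up.

```python
import multiprocessing

def do_tree_stuff(i):
    # Placeholder workload standing in for building and walking one tree.
    scratch = [i] * 100_000
    return len(scratch)

def run_each_in_fresh_process(keys):
    # maxtasksperchild=1 makes the pool replace its worker after every task;
    # chunksize=1 ensures each key is its own task.
    with multiprocessing.Pool(processes=1, maxtasksperchild=1) as pool:
        return pool.map(do_tree_stuff, keys, chunksize=1)

if __name__ == "__main__":
    print(run_each_in_fresh_process(range(3)))
```

Arguments and return values are pickled to cross the process boundary, so return small summaries rather than whole trees, or you pay for the copy and end up holding the memory in the parent anyway.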