
I have a list of permutations of DNA sequences, and I compute an alignment score for each pair of sequences. I don't know why this process causes a memory leak when the permutation list is large, since a new aligner object is created in each iteration. Here is an example of the score calculation:

for sequence1, sequence2 in sequence_permutation:
   score = self.__calculate_sequence_similarity(sequence1, sequence2)
   alignments[sequence1].append(sequence2)

save_aligments(alignments)

def __calculate_score_alignment(self, sequence1, sequence2):
   from Bio import Align
   from Bio.Align import substitution_matrices

   # A new aligner is built (and the matrix reloaded) on every call
   aligner = Align.PairwiseAligner()
   aligner.mode = 'local'
   aligner.substitution_matrix = substitution_matrices.load('BLOSUM62')
   return aligner.score(sequence1, sequence2)


def __calculate_sequence_similarity(self, sequence1: str, sequence2: str) -> float:
   # An empty sequence would give a zero self-score and a division by zero below
   if not sequence1 or not sequence2:
      return -1

   score = self.__calculate_score_alignment(sequence1, sequence2)
   score1 = self.__calculate_score_alignment(sequence1, sequence1)
   score2 = self.__calculate_score_alignment(sequence2, sequence2)

   return score / (math.sqrt(score1) * math.sqrt(score2))
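
For reference, the normalization above can be exercised without constructing anything per pair. The following is a minimal sketch, not the original code: `make_similarity` and `toy_score` are hypothetical names, and `toy_score` is only a stand-in for a real scoring callable. Self-scores are cached so that each sequence is scored against itself once rather than once per pair.

```python
import math
from itertools import permutations
from typing import Callable, Dict


def make_similarity(score_fn: Callable[[str, str], float]) -> Callable[[str, str], float]:
    """Build a normalized-similarity function around a single scoring callable."""
    self_scores: Dict[str, float] = {}  # cache of score_fn(s, s) per sequence

    def self_score(seq: str) -> float:
        if seq not in self_scores:
            self_scores[seq] = score_fn(seq, seq)
        return self_scores[seq]

    def similarity(seq1: str, seq2: str) -> float:
        # Assumption: any empty input is treated as invalid
        if not seq1 or not seq2:
            return -1.0
        denom = math.sqrt(self_score(seq1)) * math.sqrt(self_score(seq2))
        return score_fn(seq1, seq2) / denom

    return similarity


def toy_score(seq1: str, seq2: str) -> float:
    """Hypothetical stand-in scorer: number of matching positions."""
    return float(sum(a == b for a, b in zip(seq1, seq2)))


similarity = make_similarity(toy_score)
sequences = ["ACGT", "ACGA", "TTGT"]
scores = {(a, b): similarity(a, b) for a, b in permutations(sequences, 2)}
```

With Biopython, `score_fn` would presumably be the `score` method of one `PairwiseAligner` created outside the loop. Whether that avoids the memory growth described here is untested.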
Chris_Rands
Kadu
  • what is `alignments`? A dict / defaultdict? Could this be the cause of the memory increases? – Chris_Rands Jan 10 '21 at 09:58
  • `alignments` is a dict, but it isn't the problem: I tried saving the results one by one and the memory still increased a lot. I suppose there is an internal object that is not being destroyed correctly. So I decided to use another library (skbio), and the problem was resolved. Thanks! – Kadu Jan 11 '21 at 22:01

0 Answers