I have ~13K sequences of 120 bases each, and I want to compare them to find things like conserved regions, the mean divergence between them, or highly divergent outliers.

The problem is that, with this many sequences, the approaches I tried aren't feasible.

Has anyone done something similar at this scale and can give me some hints on how to achieve it? Or maybe just some tips on where I should look?

zx8754
voiDnyx

1 Answer

Use the dnadist program from the PHYLIP package. The Biopython library (its Bio.AlignIO module) can help you read and write the PHYLIP alignment format.
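As a rough illustration of the input dnadist expects, here is a minimal pure-Python sketch that formats sequences as a strict sequential PHYLIP alignment. It assumes the sequences are already aligned and all the same length; the function name `to_phylip` and the toy sequences are made up for the example, and Biopython's own writer would do the same job.

```python
def to_phylip(records):
    """Format (name, sequence) pairs as a strict sequential PHYLIP alignment."""
    records = list(records)
    if not records:
        raise ValueError("no sequences given")
    length = len(records[0][1])
    # Header line: number of sequences, then number of sites.
    lines = [f" {len(records)} {length}"]
    for name, seq in records:
        if len(seq) != length:
            raise ValueError(f"{name}: expected {length} bases, got {len(seq)}")
        # Strict PHYLIP: the name occupies exactly the first 10 characters,
        # so longer names are truncated and shorter ones are space-padded.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


# Toy data standing in for the real ~13K sequences (hypothetical values).
toy = [("seq1", "ACGTACGTAC"), ("seq2", "ACGTTCGTAC"), ("seq3", "ACGAACGTAA")]
phylip_text = to_phylip(toy)
print(phylip_text, end="")
```

Saving this text to a file gives you something to feed to dnadist, which should then produce a pairwise distance matrix from which mean divergence and outliers can be read off. Because names are cut to 10 characters, make sure your identifiers stay unique after truncation.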

xbello