if I have 3 DNA sequences and I want to evaluate them by some functions:
seq1='AG_CT'
seq2='AG_CT'
seq3='ACT_T'
How can I calculate the consensus score and the weighted sum of pairs score (WSP score) of this three DNA sequences in python?
consensus score is the sum of the pairwise score between sequences and the consensus sequence, Consensus (A)=sum{l}^{i=1}d(i) l is the lenght of sequence , d is the distance between two bases, example: d(A,B)=2 for A!=B, d(A,-)=d(-,A)=1 for A!='-',0 else. A and B may be 'A or C or G or T ' for the above example
we calculate distance between seq1 and seq2 then seq1 and seq3 then seq2 and seq3
**seq1 and seq2:**
d(A,A)=0, d(G,G)=0, d(-,-)=0, d(c,c)=0, d(t,t)=0
**seq1 and seq3**:
d(A,A)=0, d(G,C)=2, d(-,T)=1, d(c,-)=1, d(t,t)=0
**seq2 and seq3**:
d(A,A)=0, d(G,C)=2, d(-,T)=1, d(c,-)=1, d(t,t)=0
seq1= A G _ C T
seq2= A G _ C T
seq3= A C T _ T
0 0 0 0 0
0 2 1 1 0
0 2 1 1 0
++++++++++++++
0+ 4+ 2+ 2+ 0= 8
consensus(A)=8
weighted sum of pairs WSP (A) = \sum_{i=1}^{k-1} \sum_{j=i+l}^k \sum_{h=1}^l wij* s( A[ i,h ], [ j,h ] l :lenght of sequence, k number of sequences , wij weight of sequences i and j
s(A,B)=2 for A!=B, s(A,-)=d(-,A)=-1 for A!='-',3 else.all the weight factors are 1.
seq1= A G _ C T
seq2= A G _ C T
seq3= A C T _ T
3 3 3 3 3
3 2 -1 -1 3
3 2 -1 -1 3
++++++++++++++
(3+3+3)*1+(3+2+2)*1+(3-1-1)*1+(3-1-1)*1+(3+3+3)*1=9*1+7*1+1*1+1*1+9*1
9+7+1+1+9=27
So, WSP score of the three sequences is 27