Sample Input:
ACGTTGCATGTCGCATGATGCATGAGAGCT # this is the sequence in which we have to search
4 # this is k, the length of the k-mer (an integer)
Sample Output:
CATG GCAT
I do not understand how to do this. Please help me. Thanks in advance.
If I understand your question correctly, here is one way to work through the list:
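s = "ACGTTGCATGTCGCATGATGCATGAGAGCT"
n = 4
# number of start positions that leave room for a second, non-overlapping copy
k = len(s) - 2*n + 1
klist = []
for i in range(k):
    kmer = s[i:i+n]
    # keep each k-mer once, and only if it occurs again later in s
    if kmer not in klist and kmer in s[i+n:]:
        klist.append(kmer)
print(klist)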
It looks like your example has a few more k-mers than expected, unless I am misunderstanding:
['TGCA', 'GCAT', 'CATG', 'ATGA']
For n = 5:
['TGCAT', 'GCATG', 'CATGA']
And even for n = 6:
['TGCATG', 'GCATGA']
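The lists above include every k-mer that repeats, while the sample output in the question shows only CATG and GCAT. One possible reading, and this is my assumption rather than something the question states, is that only the most frequent k-mers are wanted: CATG and GCAT each occur three times in the sequence, while TGCA and ATGA occur twice. Here is a minimal sketch under that assumption. The function name most_frequent_kmers is my own, and it counts overlapping occurrences:

```python
from collections import Counter

def most_frequent_kmers(seq, k):
    """Return the k-mers that occur most often in seq, sorted alphabetically."""
    # Count every window of length k, allowing overlaps.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    if not counts:
        # seq is shorter than k, so there are no k-mers at all.
        return []
    top = max(counts.values())
    # Keep only the k-mers tied for the highest count.
    return sorted(kmer for kmer, c in counts.items() if c == top)

print(" ".join(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4)))
# prints: CATG GCAT
```

If you really want every repeated k-mer rather than just the top ones, replace the `c == top` test with `c > 1`.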