I tried to solve another of Rosalind(http://rosalind.info/problems/lexv/) problems by myself, however unfortunately I have to ask you for help.
Here is my approach:
First of all; function which creates all possible substrings of input string with length of n:
def get_substrings(input_string, l):
res_list = []
sub = []
for i in range(len(input_string)):
if l+i <= len(input_string):
for j in range(i,l+i):
sub.append(input_string[j])
sub = ''.join(sub)
res_list.append(sub)
sub = []
res_list = filter(None, res_list)
return res_list
Then main function which creates all combinations of current string with varying length:
from itertools import product
def lexv():
dna = str(raw_input())
n = int(raw_input())
subs = get_substrings(dna, n)
result = []
for i in range(len(subs)):
for j in range(1,n+1):
result = result + list(product(dna, repeat=j))
for i in range(len(result)):
result[i] = "".join(result[i])
print result[i]
The result of my code for data from Rosalind's 'Sample input' is:
D
N
A
DD
DN
DA
ND
NN
NA
AD
AN
AA
DDD
DDN
DDA
DND
DNN
DNA
DAD
DAN
DAA
NDD
NDN
NDA
NND
NNN
NNA
NAD
NAN
NAA
ADD
ADN
ADA
AND
ANN
ANA
AAD
AAN
AAA
My questions:
a) Is it possible to order my result as it should be (Rosalind's result)?
b) Is my approach correct? If not, could you give me some clue (but not the solution for this problem - I'd like to beat it by myself).
Very thanks!