2

EDITED

I created a function that introduces random mutations at a given mutation rate (for example, 0.05).

input='ATCTAGGAT'


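import random


def mutate_v2(sequence, mutation_rate):
    dna_list = list(sequence)
    # one random draw per position in the sequence
    for i in range(len(sequence)):
        r = random.random()
        if r < mutation_rate:
            # pick a random site and overwrite it with a random base
            mutation_site = random.randint(0, len(dna_list) - 1)
            print(mutation_site)
            dna_list[mutation_site] = random.choice(list('ATCG'))
    # return only after the whole loop has run
    return ''.join(dna_list)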


## run the function
mutate_v2(input, 0.01)

Now I want the function to take as input a list of sequences (example: list_sequences = ['ATTCTGTA', 'TTCGCTAA', 'ACCCGCTA']) and return each mutated sequence (in a list: output).

Any help please!

Thanks

  • And you ran this how many times? – roganjosh Oct 14 '21 at 19:03
  • What exactly is mutation rate? If you're getting the same input sequence, it's probably because `r` is never below 0.01? – not_speshal Oct 14 '21 at 19:04
  • 1
    The probability of a draw falling below 0.01 in just 9 tries is too low; that's why you're not observing mutations. To convince yourself, try with mutation_rate=0.5 – Ultramoi Oct 14 '21 at 19:05
  • @Ultramoi I edited the question. Can you please help me get the result for a list of sequences instead of just one? Thanks –  Oct 14 '21 at 19:59
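
The point made in the comments about the draw probability can be checked with a little arithmetic. This is only a sketch: it assumes one independent draw per position, and the helper name `prob_at_least_one_draw` is made up for illustration.

```python
def prob_at_least_one_draw(n_positions, mutation_rate):
    """Chance that at least one of n independent draws falls below mutation_rate."""
    return 1 - (1 - mutation_rate) ** n_positions


# 9 positions, as in 'ATCTAGGAT'
p_low = prob_at_least_one_draw(9, 0.01)
p_high = prob_at_least_one_draw(9, 0.5)

print(f"{p_low:.4f}")   # 0.0865
print(f"{p_high:.4f}")  # 0.9980
```

So at a rate of 0.01, roughly 9 runs out of 100 trigger any mutation at all, and some of those will pick the base that is already there, leaving the sequence looking unchanged.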

2 Answers

2

I would keep the interface of mutate_v2 the same (I think it is OK the way it is), but call it using a list comprehension, like so:

input = [seq_1, seq_2, seq_n]
mutation_rate = 0.05  # example rate from the question
mutated_input = [mutate_v2(s, mutation_rate) for s in input]

Alternatively, you can wrap it into its own function like so:

def mutate_multiple(sequences, mutation_rate):
    return [mutate_v2(s, mutation_rate) for s in sequences]

# call the function:
input = [seq_1, seq_2, seq_n]
mutated_input = mutate_multiple(input, 0.05)
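
For completeness, here is a self-contained sketch that puts the pieces together using the example list from the question. It assumes the mutation rate is passed through to `mutate_v2` and that the `return` sits after the loop. The optional `rng` argument is an addition for reproducible runs and is not part of the original code.

```python
import random


def mutate_v2(sequence, mutation_rate, rng=random):
    """Return a copy of sequence with random point mutations.

    rng is a hypothetical extra parameter: pass random.Random(seed)
    to get repeatable results; by default the random module is used.
    """
    dna_list = list(sequence)
    # one draw per position; each hit overwrites a randomly chosen site
    for _ in range(len(sequence)):
        if rng.random() < mutation_rate:
            site = rng.randint(0, len(dna_list) - 1)
            dna_list[site] = rng.choice("ATCG")
    return "".join(dna_list)


def mutate_multiple(sequences, mutation_rate, rng=random):
    """Apply mutate_v2 to every sequence and collect the results in a list."""
    return [mutate_v2(s, mutation_rate, rng) for s in sequences]


list_sequences = ['ATTCTGTA', 'TTCGCTAA', 'ACCCGCTA']
output = mutate_multiple(list_sequences, 0.05)
print(output)
```

The output list has the same length and order as the input, so `output[i]` is the mutated version of `list_sequences[i]`.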
Timur Shtatland
1

To return a list of sequences instead of just one, you simply have to call your function multiple times.

def multiple_draws(sequence, mutation_rate, n_draws=1):
    return [mutate_v2(sequence, mutation_rate) for _ in range(n_draws)]

print(multiple_draws(input, 0.01, 10))

Or, if an explicit loop is easier for you to read:

def multiple_draws(sequence, mutation_rate, n_draws=1):
    mutations = []
    for _ in range(n_draws):
        mutation = mutate_v2(sequence, mutation_rate)
        mutations.append(mutation)
    return mutations
Ultramoi
  • Thank you for your effort, but I'm looking for a function that accepts a list of sequences as input, not only one. For example, given **input=[seq_1, seq_2, seq_n]**, it returns a new list **output=[seq_1_muted, seq_2_muted, seq_n_muted]**. –  Oct 14 '21 at 20:41
  • 1
    I see, then I think the answer from Timur Shtatland should fit your expectations :) – Ultramoi Oct 15 '21 at 11:24