I am trying to discover text encoded in ASCII found within a DNA sequence from a file.
Below is my code:
The first step opens the FASTA file and stores its contents in a variable:
with open("/home/<username>/python/progseq") as mydnaseq:
    sequence = mydnaseq.read().replace('\n','')
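One thing I am unsure about: if the file contains FASTA header lines (lines starting with ">"), reading it this way would leave the header text inside the sequence. A minimal sketch that skips such lines, assuming the file follows the usual FASTA layout; `read_fasta_sequence` is a hypothetical helper name and the example header is made up:

```python
def read_fasta_sequence(lines):
    """Join sequence lines, skipping '>' header lines and blank lines."""
    return "".join(
        line.strip()
        for line in lines
        if line.strip() and not line.startswith(">")
    )

# Works on any iterable of lines, such as an open file handle.
example = [">seq1 hypothetical header\n", "ACGT\n", "\n", "TTGA\n"]
print(read_fasta_sequence(example))
```

With a real file this would be called as `read_fasta_sequence(mydnaseq)` inside the `with` block.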
The second step translates the sequence into binary. The line below handles A, and I did the same for the letters C and G/T to equal 1:
binarysequence = sequence.replace('A','0')
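The same translation can be done for all four letters in one pass with `str.translate`. This is only a sketch: the mapping used here (A and C become 0, G and T become 1) is an assumption for illustration, and `dna_to_binary` is a hypothetical name.

```python
# Assumed mapping, for illustration only: A and C -> 0, G and T -> 1
DNA_TO_BITS = str.maketrans("ACGT", "0011")

def dna_to_binary(sequence):
    """Translate a DNA string into a string of '0' and '1' characters."""
    return sequence.upper().translate(DNA_TO_BITS)

print(dna_to_binary("AAGGACAA"))
```

Characters outside A, C, G and T (for example N) are left unchanged by `translate`, so they would need separate handling.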
Then I took this long binary sequence and wanted to split it into 8-bit chunks:
for i in range(0,len(binarysequence),8):
    binarysequence[i:i+8]
This then created an output like this:
'00110100'
'00110010'
'01000110'
'00011000'
'0'
The full output was much longer; I have included only the last few chunks of the sequence.
I would like to know how to convert these chunks into letters.
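To make the goal concrete, here is a minimal sketch of the kind of conversion I am after. It assumes each complete 8-bit chunk is one ASCII code and that an incomplete trailing chunk (like the final '0' above) can be dropped; `bits_to_text` is a hypothetical name.

```python
def bits_to_text(bits):
    """Decode a string of '0'/'1' characters as 8-bit ASCII codes."""
    usable = len(bits) - len(bits) % 8  # drop an incomplete trailing chunk
    chunks = [bits[i:i + 8] for i in range(0, usable, 8)]
    # int(chunk, 2) parses the chunk as base 2; chr() maps the code to a character
    return "".join(chr(int(chunk, 2)) for chunk in chunks)

print(bits_to_text("00110100" "00110010" "01000110" "0"))
```

The first three chunks shown in my output decode to the codes 52, 50 and 70. The fourth, '00011000', is code 24, a non-printable control character, so not every chunk necessarily corresponds to a visible letter.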