I'm new to Python and am having trouble writing a function that converts a FASTQ quality string into a list of Phred-scaled quality scores. Any help would be appreciated.

Here is a FASTQ read:

@SEQ_ID
AAGCGTCTGATCGGCAGAGGATACACATGCCGCACGTCGAGTATCTCGGC
+
=3:AAF>FGD1FCGGGGGFBGGGGCGGG1FE>>>E<:>/<9:CDGFG@GG

This is the function definition:

def quality_to_list(quality_string):
Maximilian Peters
john.doe

1 Answer

Biopython has a couple of good examples and documentation on Phred scores. Its FASTQ parser decodes the quality string for you:

from Bio import SeqIO

# Write the example read to a temporary FASTQ file.
with open('tmp.fastq', 'w') as f:
    f.write("""@SEQ_ID
AAGCGTCTGATCGGCAGAGGATACACATGCCGCACGTCGAGTATCTCGGC
+
=3:AAF>FGD1FCGGGGGFBGGGGCGGG1FE>>>E<:>/<9:CDGFG@GG""")

# The "fastq" format assumes Sanger-style (Phred+33) quality encoding.
for record in SeqIO.parse("tmp.fastq", "fastq"):
    print("ID: {0}\nPhred scores: {1}".format(record.id, record.letter_annotations['phred_quality']))

Output:

ID: SEQ_ID 
Phred scores: [28, 18, 25, 32, ..., 34, 35, 38, 37, 38, 31, 38, 38]
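
If you would rather not depend on Biopython, the same decoding can be sketched in plain Python. This is a minimal version, and it assumes the quality string uses the Phred+33 (Sanger / Illumina 1.8+) encoding, which is consistent with the output above. The `offset` parameter is an addition for illustration and is not part of the original function definition. Older Illumina files (1.3 to 1.7) use an offset of 64.

```python
def quality_to_list(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores.

    Each character encodes one score as chr(score + offset), so the
    score is recovered with ord(char) - offset. The default of 33 is
    an assumption (Sanger / Illumina 1.8+ encoding).
    """
    return [ord(char) - offset for char in quality_string]


quality = "=3:AAF>FGD1FCGGGGGFBGGGGCGGG1FE>>>E<:>/<9:CDGFG@GG"
scores = quality_to_list(quality)
print(scores[:4])   # [28, 18, 25, 32]
print(scores[-4:])  # [38, 31, 38, 38]
```

The first and last values match the Biopython output, which is a quick way to check that the offset is right for your data.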
Maximilian Peters