5

I'm looking for ambush stop codons. I've gotten my code to the point where I'm extracting the sequences I need from my embl files. However I'm a bit stumped on how to add two upstream and two downstream nucleotides so I end up having -2,-1, 0, 1, 2 reading frames.

for rec in SeqIO.parse("CP002701.embl", "embl"):
    if rec.features:
        for feature in rec.features:
            if feature.type == "CDS":
                print(feature.location)
                print(feature.qualifiers["protein_id"])
                print(feature.location.extract(rec).seq)      

is the section I want to change but not sure how to change .location in order to select the extra 4 bases that I am interested in.

merv
  • 67,214
  • 13
  • 180
  • 245

1 Answers1

1

@user7550844 (OP) wrote on Feb 12 '17 at 15:46:

After some help from mofoniusrex on reddit here's a solution that works:

for rec in SeqIO.parse("CP002701.embl", "embl"):
if rec.features:
    for feature in rec.features:
        if feature.type == "CDS":
            pad=2
            newloc = SeqFeature.FeatureLocation( feature.location.start - pad,feature.location.end + pad)
            newfeature=SeqFeature.SeqFeature(location=newloc,
                 type=feature.type,
                 strand=feature.strand,
                 ref=feature.ref,
                 ref_db=feature.ref_db)
            newfeature.qualifiers = feature.qualifiers
            print(newfeature.qualifiers)
            print(newfeature.location)
            print(newfeature.location.extract(rec).seq)
merv
  • 67,214
  • 13
  • 180
  • 245