1

I am writing a program that reads input from a file in which each line may contain "ATG" or "GTG", and I am fairly sure I have done everything right as far as what I am trying to do. It is my first time using a generator in Python, and after researching this problem I still don't know why I am getting StopIteration. For this, my generator must yield a tuple with the start locations of every ATG or GTG found in each string.

import sys

import p3mod


gen = p3mod.find_start_positions()
gen.send(None)   # prime the generator

with open(sys.argv[1]) as f:
    for line in f:
        (seqid,seq) = line.strip().lower().split()
        slocs = gen.send(seq)
        print(seqid,slocs,"\n")

gen.close()  ## added to be more official

This is the generator

def find_start_positions (DNAstr = ""):

    DNAstr = DNAstr.upper()

    retVal = ()
    x = 0
    loc = -1

    locations = []

    while (x + 3) < len(DNAstr):

        if (DNAst[x:x+3] is "ATG" or DNAstr[x:x+3] is "GTG" ):
            loc = x

        if loc is not -1:
            locations.append(loc)

        loc = -1

    yield (tuple(locations))

This is the error:

Traceback (most recent call last):
  File "p3rmb.py", line 12, in <module>
    slocs = gen.send(seq)
StopIteration
Tyler Dunn

4 Answers

2

You made a generator that returns all the data at once. You should yield the data in each iteration. This code might not be perfect, but it might solve part of your problem:

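def find_start_positions(DNAstr=""):
    DNAstr = DNAstr.upper()

    x = 0

    # Use <= so the final three-letter window is checked too.
    while x + 3 <= len(DNAstr):
        # Compare strings with ==, not `is`.
        if DNAstr[x:x+3] == "ATG" or DNAstr[x:x+3] == "GTG":
            yield x  # yield each start position as soon as it is found

        x += 1  # advance the window; without this the loop never ends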

StopIteration isn't an error in itself. It's how a generator signals that it has run out of data. You just need to wrap the call in try/except, or use your generator in a for loop, which already handles it for you. Generators aren't that complicated, but it may take some time to get used to these "weird" errors. ;)
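
To make the StopIteration behavior concrete, here is a minimal sketch. The generator `count_up_to` is a hypothetical stand-in invented for illustration and is not part of the original program; it only shows that a for loop (or comprehension) absorbs StopIteration, while stepping the generator by hand exposes it.

```python
def count_up_to(limit):
    """Hypothetical generator used only for illustration: yields 0 .. limit-1."""
    n = 0
    while n < limit:
        yield n
        n += 1


# A for loop or comprehension handles StopIteration internally and simply ends.
collected = [value for value in count_up_to(3)]

# Driving the generator by hand exposes the exception.
gen = count_up_to(1)
first = next(gen)      # runs up to the first (and only) yield
try:
    next(gen)          # the function body finishes, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

print(collected, first, exhausted)
```

The same thing happens with send(): if the generator function finishes instead of reaching another yield, the send() call raises StopIteration.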

Jayme Tosi Neto
  • 1
    Don't use `is` to compare strings. – juanpa.arrivillaga Jun 20 '17 at 23:53
  • 1
    @juanpa.arrivillaga Sure man! I hadn't noticed that; I was focused completely on the generator stuff. Thank you! – Jayme Tosi Neto Jun 20 '17 at 23:58
  • Is the variable DNAstr changed every time gen.send() is used in the main function? – Tyler Dunn Jun 21 '17 at 00:43
  • Also, do I need to clear out the lists and stuff after every yield? – Tyler Dunn Jun 21 '17 at 00:46
  • @TylerDunn The variable is changed only inside the function: strings are immutable, so `DNAstr = DNAstr.upper()` just rebinds the local name to a new string and the caller's string is untouched. Note that gen.send() does not rebind DNAstr; the value you send becomes the result of the yield expression inside the generator. As for clearing the lists: not necessarily. It's good to use as few resources and variables as possible, but it's not mandatory. If you clear the wrong things, though, the generator may behave strangely. – Jayme Tosi Neto Jun 21 '17 at 09:07
2

Your generator is built to return only one value in its entire lifetime. It iterates through the while loop, finds all of the locations, and returns that entire list in one fell swoop. Thus, when you call send a second time, you have exhausted the generator's operations.

You need to figure out what you expect from each invocation of send; configure your loop to produce just that much, and then yield that result ... and keep doing that for future send invocations. Your yield statement has to be inside a loop for this to work.

Jayme gave you an excellent example in his answer.
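
For the send()-driven usage in the question, the "yield inside a loop" idea could look roughly like the sketch below. It assumes the driver primes the generator once and then sends one sequence per line, as the question's script does; the handling of a `None` value is an added assumption, not something the original code specifies.

```python
def find_start_positions():
    """Coroutine-style generator: receives a sequence via send() and
    yields a tuple of the start positions of every ATG or GTG in it."""
    locations = ()
    while True:
        # Hand back the previous result, then wait for the next send().
        seq = yield locations
        if seq is None:    # assumption: treat a bare next() as "nothing to scan"
            locations = ()
            continue
        seq = seq.upper()
        locations = tuple(
            i for i in range(len(seq) - 2)
            if seq[i:i + 3] in ("ATG", "GTG")
        )


gen = find_start_positions()
gen.send(None)                   # prime: run to the first yield
print(gen.send("ccatggtgaa"))    # (2, 5)
print(gen.send("aaaa"))          # ()
gen.close()
```

Because the yield sits inside `while True`, every send() reaches another yield, so the generator never finishes on its own and StopIteration is not raised until close() is called.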

Prune
0

Strings have a built-in find() method that finds a substring in a given string; it returns the index of the first match, or -1 if there is none. Is a generator really necessary here?

Instead, you could try:

import sys

with open(sys.argv[1]) as f:
    text = f.read()

for i, my_string in enumerate(text.strip().lower().split()):
    atg_index = my_string.find('atg')
    gtg_index = my_string.find('gtg')
    print(i, atg_index, gtg_index, my_string)
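
Since find() reports only the first match, collecting every start position needs a loop over its optional start argument. The following is a rough sketch of that idea; `all_positions` is a hypothetical helper name, and the input is assumed to be already lowercased, as in the snippet above.

```python
def all_positions(seq, codons=("atg", "gtg")):
    """Hypothetical helper: return the sorted start indexes of every
    occurrence of each codon in seq."""
    positions = []
    for codon in codons:
        start = seq.find(codon)
        while start != -1:
            positions.append(start)
            # Resume one character later so overlapping matches are not skipped.
            start = seq.find(codon, start + 1)
    return sorted(positions)


print(all_positions("ccatggtgaatg"))    # [2, 5, 9]
```
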
Aditya Barve
0
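def find_start_positions(DNAstr=""):
    DNAstr = DNAstr.upper()

    x = 0

    # Use <= so the final three-letter window is checked too.
    while x + 3 <= len(DNAstr):
        # Compare strings with ==, not `is`.
        if DNAstr[x:x+3] == "ATG" or DNAstr[x:x+3] == "GTG":
            yield x  # yield each start position as soon as it is found

        x += 1  # advance the window; without this the loop never ends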
cegprakash