I am trying to create a function where the user inputs a file name (the file contains a DNA sequence), and the number of each base in that file is counted and printed to the screen in the order #A, #G, #C, #T. I then want to save that output as a new file under a user-supplied name with the extension .count. In bash I then want to concatenate those files (there will be 100 in total) into a single .csv document with the following format:

File #A,#G,#C,#T
file.count 1 23,43,32,41
file.count 2 etc...
To open the file, I have:

def openseq(filename):
  filename=input("enter file to open: ")
  openfile=open(filename,"r")
  dnatext=print(openfile.read())
  return dnatext

Originally I then tried to nest a for loop inside it (under dnatext) with the following:

for i in dnatext:
    comma = ","
    numberofbases=str(dnatext.count('A')) + comma + str(dnatext.count('G'))   + comma + str(dnatext.count('C')) + comma + str(dnatext.count('T'))
  return numberofbases

And then to save the file under a new name input by the user:

directory="<desired directory>" #removed directory for privacy
  newname= input("Enter output file name: ")
  filepath = directory + newname + ".count"
  filepath.close()

But no matter how I move things around, I either get the error message TypeError: 'NoneType' object is not iterable or a message that some variable is not defined. I've tried a few ways to resolve this but am not having any luck. As I am relatively new to coding (especially the combination of Python and bash), I would very much appreciate some help, or even an explanation of why I am unable to count the number of bases in the input sequence.

Ideally I am trying to get this all into 1 or 2 functions so I can call them easily in bash but I am not sure whether that is even possible.

Renaud Pacalet
heather_l
  • The `print` function returns nothing. Do the `read` and the `print` in separate lines. Next, since you have the whole text in one big block, you do NOT want to do `for i in dnatext:`, which will loop character by character. Just do the `count` calls without the loop. – Tim Roberts Mar 21 '22 at 22:38
  • You can use "split" rather than a "for" loop; it's more efficient. – Mabadai Mar 21 '22 at 22:56

1 Answer

It's hard to tell without a MWE, but in general, "NoneType is not iterable" means (perhaps unsurprisingly) that you're trying to iterate over the value None.

In the code you posted there's only one place where you iterate:

for i in dnatext:

Here dnatext is expected to be iterable and your error suggests it's in fact None.

The cause is probably the bug in this function:

def openseq(filename):
    filename=input("enter file to open: ")
    openfile=open(filename,"r")
    dnatext=print(openfile.read())  # <-- this line
    return dnatext

The print function doesn't return anything (or returns None, depending on how you look at it). So this function will always return None.
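A quick way to confirm this is the following minimal sketch. The names `result` and `error_message` are only for illustration and are not from the original code:

```python
# print() writes its argument to stdout and returns None,
# so assigning its result captures nothing useful.
result = print("ACGT")
print(result is None)

# Iterating over None reproduces the error from the question.
try:
    for base in result:
        pass
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```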

Instead, you probably (but again, it's hard to say) want:

def openseq(filename):
    filename = input("enter file to open: ")
    with open(filename,"r") as openfile:
        dnatext = openfile.read()
    print(dnatext)
    return dnatext

which

  • uses a context manager to also close the file handle for you, and
  • prints and returns the data that was read (instead of just printing it)
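Once the text is returned rather than printed, the counting step probably needs no loop at all, since `str.count` already scans the whole string. The following is only a sketch under some assumptions: the helper names `read_sequence`, `count_bases` and `save_counts` are hypothetical, the bases are assumed to be upper-case, and the file name is passed in as an argument instead of being prompted for with `input()`, which may be easier to drive from a script.

```python
from pathlib import Path


def read_sequence(path):
    """Return the file contents as one string (hypothetical helper)."""
    with open(path, "r") as handle:
        return handle.read()


def count_bases(dnatext):
    """Return the counts as 'A,G,C,T'. Assumes upper-case bases."""
    # str.count scans the whole string, so no for loop over characters is needed.
    return ",".join(str(dnatext.count(base)) for base in "AGCT")


def save_counts(counts, directory, newname):
    """Write counts to <directory>/<newname>.count and return that path."""
    filepath = Path(directory) / (newname + ".count")
    with open(filepath, "w") as handle:
        handle.write(counts + "\n")
    return filepath
```

The caller has to keep what each function returns, for example `dnatext = read_sequence(name)` followed by `counts = count_bases(dnatext)`, because a name assigned inside a function is local to that function and is not visible outside it.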
jedwards
  • Hi @jedwards thank you! I am not confident that I even need the print(dnatext) line but now my issue is counting the bases as apparently 'dnatext' is not assigned as an actual variable containing the sequence data. I believe once I am able to assign the sequence to a variable I should be able to count the bases no problem – heather_l Mar 21 '22 at 22:59