startswith TypeError in function

Question

Here is the code:

    def readFasta(filename):
        """ Reads a sequence in Fasta format """
        fp = open(filename, 'rb')
        header = ""
        seq = ""
        while True:
            line = fp.readline()
            if (line == ""):
                break
            if (line.startswith('>')):
                header = line[1:].strip()
            else:
                seq = fp.read().replace('\n','')
                seq = seq.replace('\r','')          # for windows
                break
        fp.close()
        return (header, seq)

    FASTAsequence = readFasta("MusChr01.fa")

The error I'm getting is:

TypeError: startswith first arg must be bytes or a tuple of bytes, not str

But the first argument to startswith is supposed to be a string according to the docs... so what is going on?

I'm assuming I'm using at least Python 3 since I'm using the latest version of LiClipse.

TerryA · Accepted Answer · 2013-11-07T03:54:15.147

78

It's because you're opening the file in bytes mode, and so you're calling bytes.startswith() and not str.startswith().

You need to do line.startswith(b'>'), which will make '>' a bytes literal.
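To illustrate the difference, here is a minimal sketch that uses an in-memory `io.BytesIO` stream as a stand-in for a file opened with `'rb'`. The sample FASTA content is made up for the demonstration and is not from the asker's file.

```python
import io

# Hypothetical sample data; BytesIO behaves like a file opened in 'rb' mode.
fp = io.BytesIO(b">chr1 test header\nACGT\nTTGA\n")

line = fp.readline()  # a bytes object, because the stream is binary

# Passing a str prefix to bytes.startswith() raises the TypeError from the question.
try:
    line.startswith('>')
    error = None
except TypeError as exc:
    error = str(exc)

# A bytes prefix works as expected.
is_header = line.startswith(b'>')

# Decode only if a str is needed later (ASCII is assumed for this sample).
header = line[1:].strip().decode('ascii')
```

Note that in binary mode the end-of-file check has the same problem: `readline()` returns `b""` at the end, which never compares equal to `""`.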

edited Nov 07 '13 at 03:54

answered Nov 07 '13 at 03:45


Ah I added b before all the strings and now it works. Thanks! – user2287873 Nov 07 '13 at 03:48
Hmm on a sidenote, seq = fp.read().replace(b'\n',b'') seems to be messing up the stuff that's read. Not sure what's going on but it only seems to be iterating twice (in a 190mb file) and outputting b' each time. – user2287873 Nov 07 '13 at 04:00
This is not backwards compatible. – Cerin May 03 '17 at 16:36

The problem is on first argument, not the second, so `line.startswith(b'>')` cannot possibly solve it. `bytes(line).startswith('>')`, on the other hand, could. – mpiskore May 14 '17 at 20:19
@mpiskore I used (based on this answer) `line.endswith(b'\n')` and I think it works well. – mirek Nov 01 '22 at 22:41

score 2 · Answer 2 · answered Jul 16 '19 at 14:45


If you want to keep opening the file in binary mode, replacing 'STR' with bytes('STR'.encode('utf-8')) works for me.
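As a rough sketch of that approach (the sample line is made up): `str.encode()` already returns a `bytes` object, so the result equals the literal `b'>'`, and wrapping it in `bytes(...)` only makes a copy.

```python
# Encode the str prefix so it can be compared against bytes read in 'rb' mode.
prefix = '>'.encode('utf-8')

# Hypothetical line, as it would come back from a binary-mode readline().
line = b">chr1 test header\n"

matches = line.startswith(prefix)

# The extra bytes(...) wrapper from the answer yields an equal value.
wrapped = bytes('>'.encode('utf-8'))
```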

answered Jul 16 '19 at 14:45

wenching


score 0 · Answer 3 · answered Jan 29 '17 at 17:58


I don't have your file to test with, but try opening it in text mode with an explicit UTF-8 encoding:

    fp = open(filename, 'r', encoding='utf-8')
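For reference, a text-mode version of the whole function might look like the sketch below. The name `read_fasta`, the temporary file, and its contents are made up for the demonstration. Unlike the original, this sketch keeps the first sequence line (the original's `fp.read()` skips the line already consumed by `readline()`) and stops at the second `>` record. Both are my own choices, not something the question asked for.

```python
import os
import tempfile


def read_fasta(filename):
    """Return (header, sequence) for the first record of a FASTA file."""
    header = ""
    seen_header = False
    chunks = []
    # Text mode yields str lines, so str.startswith('>') works unchanged.
    with open(filename, 'r', encoding='utf-8') as fp:
        for line in fp:
            if line.startswith('>'):
                if seen_header:
                    break  # a second record starts here; stop reading
                header = line[1:].strip()
                seen_header = True
            else:
                # strip() drops '\n' and any stray '\r' from Windows files
                chunks.append(line.strip())
    return header, "".join(chunks)


# Write a small hypothetical FASTA file and read it back.
with tempfile.NamedTemporaryFile('w', suffix='.fa', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write(">chr1 demo\nACGT\nTTGA\n>chr2 other\nGGGG\n")
    path = tmp.name

header, seq = read_fasta(path)
os.remove(path)
```

Iterating over the file line by line also avoids loading a 190 MB file into memory with a single `read()` before the replacements run.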

answered Jan 29 '17 at 17:58

Andre Odendaal

