Extracting text then saving to plain text file - TypeError: an integer is required (got type str)

Question

I am converting pdfs to text and got this code off a previous post:

Extracting text from a PDF file using PDFMiner in python?

When I print(text) it has done exactly what I want, but then I need to save this to a text file, which is when I get the above error.

The code follows exactly the first answer on the linked question. Then I:

text = convert_pdf_to_txt("GMCA ECON.pdf")

file = open('GMCAECON.txt', 'w', 'utf-8')
file.write(text)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-ebc6b7708d93> in <module>
----> 1 file = open('GMCAECON.txt', 'w', 'utf-8')
  2 file.write(text)

TypeError: an integer is required (got type str)

I'm afraid it's probably something really simple but I can't figure it out. I want it to write the text to a text file with the same name, which I can then do further analysis on. Thanks.

Please edit your post to include the full traceback you get. – buran, Aug 28 '19 at 09:49

score 2 · Accepted Answer · answered Aug 28 '19 at 09:56

The problem is your third argument. The third positional argument accepted by open is buffering (an integer), not encoding.

Call open like this:

open('GMCAECON.txt', 'w', encoding='utf-8')

and your problem should go away.
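
As a minimal sketch of the difference, the snippet below tries both call styles. The temporary directory and the sample string are assumptions made only to keep the example self-contained; they are not part of the original code.

```python
import os
import tempfile

# Hypothetical output path in a temporary directory, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "GMCAECON.txt")

# open(file, mode, buffering, encoding, ...): the third positional slot is
# buffering, which must be an integer, so a string there raises TypeError.
try:
    open(path, "w", "utf-8")
    positional_failed = False
except TypeError:
    positional_failed = True

# Passing the encoding by keyword reaches the intended parameter.
f = open(path, "w", encoding="utf-8")
f.write("sample text")
f.close()
```

The exact wording of the TypeError message differs between Python versions, so the sketch only checks the exception type.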

matevzpoljanc

Thank you, that's great – Rachel9866 Aug 28 '19 at 09:59

score 1 · Answer 2 · answered Aug 28 '19 at 09:56

When you call file = open('GMCAECON.txt', 'w', 'utf-8'), you pass positional arguments to open(). You intend the third argument as the encoding, but the third positional parameter that open() expects is buffering, which must be an integer. You need to pass the encoding as a keyword argument, e.g. file = open('GMCAECON.txt', 'w', encoding='utf-8')

Note that it is much better to use a with statement (context manager), which closes the file for you:

with open('GMCAECON.txt', 'w', encoding='utf-8') as f:
    f.write(text)
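
Since the question asks for a text file with the same name as the PDF, one possible sketch derives the output name with pathlib and reads the file back to confirm the round trip. The placeholder text and the temporary directory are assumptions for illustration; in the real script, text would come from convert_pdf_to_txt.

```python
import tempfile
from pathlib import Path

# Placeholder standing in for the result of convert_pdf_to_txt(...).
text = "Example text with a non-ASCII character: \u00e9"

# Hypothetical location; a temporary directory keeps the sketch self-contained.
pdf_path = Path(tempfile.mkdtemp()) / "GMCA ECON.pdf"
txt_path = pdf_path.with_suffix(".txt")  # same name, .txt extension

with open(txt_path, "w", encoding="utf-8") as f:
    f.write(text)
# The file is closed here, even if write() had raised.

# Read it back with the same encoding to confirm the contents.
with open(txt_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Note that this keeps the space in the file name ("GMCA ECON.txt"), whereas the question's code writes to 'GMCAECON.txt'.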
