What I am trying to do here is convert a PDF to a text file. The .txt file does not exist beforehand; it is created by `creaty`. The problem is that although `writy.write()` has worked fine in other scripts, here it does not change the file at all, so the file stays blank. What should I change? Thanks.

P.S. Both `open` calls specify the encoding because the extracted text contains the character `\u0152`.

import PyPDF2

pdfFileObj = open('Computer_science_paper_1__HL.pdf', 'rb')
creaty = open('Computer_science_paper_1__HL.txt', 'w+', encoding="utf-8")
writy = open('Computer_science_paper_1__HL.txt', 'a', encoding="utf-8")

pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

for x in range(1,pdfReader.numPages):
    pageObj = pdfReader.getPage(x)
    result = pageObj.extractText()

    writy.write(result)
lofihelsinki
  • Have you tried printing the `result` variable to standard output, so you can verify that `PyPDF2` actually finds something when you call `extractText()`? – Niels B. Jan 02 '18 at 15:10
  • Yes, the actual code contains `print(result)` just before the last line; I omitted it here because I thought it was trivial. It does print the correct output, though. – Nepheli Kardassi Jan 02 '18 at 15:14
  • When do you close `writy`? The file is buffered, so nothing is actually written to disk until `flush` or `close` is called. If the data is big enough I'd expect some output, but if the buffer is larger than the data you might not get anything. It's safer to do `with open('file', 'a') as writy:` ... – Corley Brigman Jan 02 '18 at 15:33
  • Umm, I never closed it. Now it works, thanks a lot :) – Nepheli Kardassi Jan 02 '18 at 15:36
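
For later readers, the fix from the comments can be sketched roughly as below. This is only a minimal illustration, not the asker's final script: `write_pages`, `sample_pages`, and `demo_path` are hypothetical names, and a plain list of strings stands in for the text that `pageObj.extractText()` would return, so the snippet runs without PyPDF2 or the original PDF. It also assumes a single handle opened in mode `'w'` is enough, since that mode already creates the file.

```python
import os
import tempfile


def write_pages(page_texts, out_path):
    """Write each page's text to out_path and return the characters written."""
    total = 0
    # Mode 'w' creates (or truncates) the file, so no separate 'w+' handle is needed.
    # Leaving the with block closes the file, which flushes the buffer to disk.
    with open(out_path, 'w', encoding='utf-8') as out_file:
        for text in page_texts:
            total += out_file.write(text)
    return total


# Hypothetical stand-in for the strings returned by pageObj.extractText().
sample_pages = ["page one \u0152\n", "page two\n"]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
written = write_pages(sample_pages, demo_path)

# Read the file back to confirm the text actually reached the disk.
with open(demo_path, encoding='utf-8') as check:
    content = check.read()
```

In the original script, the same pattern would presumably wrap the page loop in `with open(..., 'w', encoding="utf-8") as writy:`. Calling `writy.close()` after the loop should have the same effect, but the `with` form also closes the file if an exception is raised partway through.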
