
While parsing a PDF file with PyPDF2, hyphenated words such as mm-dd-yy are split across separate lines, like this:

mm

-

dd

-

yy

This is my code:

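import PyPDF2

def getPDFContent(path):
    # Open in binary mode; file() exists only in Python 2, open() works in both.
    with open(path, "rb") as f:
        pdf = PyPDF2.PdfFileReader(f)
        # Extract the text of the first page only.
        content = pdf.getPage(0).extractText() + "\n"
    return content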

How can I fix this so that the parts are printed on the same line?
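One possible workaround, as a rough sketch rather than a PyPDF2 feature: post-process the extracted string and collapse the line breaks around any hyphen that sits alone on its own line. The helper name join_hyphenated is hypothetical, and the sketch assumes the extracted text really does look like the output shown above (a lone "-" separated from its neighbours by one or more newlines).

```python
import re

def join_hyphenated(text):
    """Rejoin fragments split around a lone hyphen, e.g. 'mm\\n-\\ndd' -> 'mm-dd'.

    Hypothetical helper: it only touches hyphens that have a line break
    on both sides, so hyphens inside a normal line are left alone.
    """
    # \s* absorbs blank lines and stray spaces on either side of the hyphen.
    return re.sub(r"\s*\n\s*-\s*\n\s*", "-", text)
```

It could be applied as print(join_hyphenated(getPDFContent(path))). Note that it would also merge a list bullet written as a hyphen on a line by itself, so it may need tightening for other documents.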

Martin Thoma
sri vignes
