
I am trying to convert a PDF to plain text using pdfminer.high_level.extract_text(), but I keep getting this error:

File "/Users/ian/Documents/Resume Selector Project/resumeBackend.py", line 5, in digestResume
    text = pdfminer.high_level.extract_text
AttributeError: module 'pdfminer' has no attribute 'high_level'

At first, I thought the module might not be installed system-wide, but I believe I have ruled that out by running pdf2txt.py in the same directory as my project.

Here is my code:

import pdfminer
print(pdfminer.__version__)
res = '~/Documents/Personal/Employment/Resumes/Resume\ 11/03/2020'
def digestResume(resume): #resume is a pdf file (as str)
    text = pdfminer.high_level.extract_text(resume)
    print(text)
    
digestResume(res)
Ruli
iamianbrown

1 Answer


To use pdfminer.high_level, first install the package with pip3 install pdfminer.six. Then add the line import pdfminer.high_level after your import pdfminer line. This is needed because importing a package does not automatically import its submodules, so import pdfminer on its own leaves pdfminer.high_level undefined.
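
As a minimal sketch of that fix, assuming pdfminer.six is installed: the function below imports the submodule explicitly before calling extract_text. The names resolve_pdf_path and digest_resume are hypothetical, chosen for illustration. The sketch also expands a leading ~ in the path, since Python does not perform shell-style tilde expansion when opening files.

```python
from pathlib import Path


def resolve_pdf_path(raw_path):
    """Expand a leading '~' to the user's home directory (hypothetical helper)."""
    return Path(raw_path).expanduser()


def digest_resume(resume_path):
    """Return the plain text of the PDF at resume_path."""
    # Import the submodule explicitly: `import pdfminer` alone does not load
    # pdfminer.high_level. Requires `pip3 install pdfminer.six`.
    from pdfminer.high_level import extract_text

    return extract_text(str(resolve_pdf_path(resume_path)))
```

Calling print(digest_resume("~/Documents/resume.pdf")) with a real file path (the one here is a placeholder) should then print the extracted text. Putting import pdfminer.high_level at the top of the file, as described above, works equally well; the import is placed inside the function here only to keep the sketch self-contained.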

Nathan Farlow