
In Django, I get the file uploaded by the user with input_pdf = request.FILES['pdf'], and I want to extract its text with the pdftextract library using pdf = XPdf(input_pdf). This raises an error: TypeError: _getfullpathname: path should be string, bytes or os.PathLike, not InMemoryUploadedFile. How can I get the path of the user-uploaded file, or how can I use pdftextract with an InMemoryUploadedFile?
For local files, pdftextract extracts the text correctly with the following code:

from pdftextract import XPdf
file_path = "examples/pubmed_example.pdf"
pdf = XPdf(file_path)
txt = pdf.to_text()
print(txt)
Meysam
  • Your actual code that is giving the error would be much more relevant than code that works perfectly. – David Sep 08 '21 at 19:04

1 Answer

It looks like you are passing an uploaded file object to XPdf, but XPdf expects a file path. An InMemoryUploadedFile is held in memory and has no path on disk, so save the upload to a file first (open a path for writing in binary mode and write the uploaded file's contents to it), then call XPdf on that path.
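Something along these lines might work; treat it as a rough sketch rather than tested code for your project. The helper names save_upload_to_temp and extract_text_from_upload are made up for illustration, and I am assuming XPdf only needs a readable path and that to_text() behaves as in your local example. chunks() is the method Django's uploaded file objects provide for reading the content piece by piece.

```python
import os
import tempfile


def save_upload_to_temp(uploaded_file, suffix=".pdf"):
    """Write an uploaded file object to a temporary file and return its path.

    Hypothetical helper: accepts a Django UploadedFile (which has chunks())
    or any binary file-like object with read().
    """
    # delete=False keeps the file on disk after it is closed, so another
    # library can open it by path (an open temp file cannot be reopened
    # on Windows).
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        if hasattr(uploaded_file, "chunks"):
            # Stream the upload in pieces instead of loading it all at once.
            for chunk in uploaded_file.chunks():
                tmp.write(chunk)
        else:
            tmp.write(uploaded_file.read())
        return tmp.name


def extract_text_from_upload(uploaded_file):
    """Save the upload to disk, run XPdf on the path, then clean up."""
    # Imported here so the helper above can be used without pdftextract.
    from pdftextract import XPdf

    path = save_upload_to_temp(uploaded_file)
    try:
        # Same calls as in the question's working local-file example.
        return XPdf(path).to_text()
    finally:
        # Remove the temporary copy whether or not extraction succeeded.
        os.remove(path)
```

In the view you would then call txt = extract_text_from_upload(request.FILES['pdf']). The temporary file is deleted afterwards, so nothing from the upload is left on disk.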

David