I am trying to convert a PDF document to a text document using the pdftotext program.

I need to call this command-line application from a Python script to convert the file.

I have the following code:

import os 
import subprocess

path = "C:\\Users\\..." 
pdffname = "pdffilename.pdf" 
txtfname = "txtfilename.txt"

subprocess.call(['pdftotext', '-layout', 
     os.path.join(path, pdffname),
     os.path.join(path, txtfname)])

When I run this code, I get the following error:

  File "C:/Users/.../code-1.py", line 44, in <module>
    os.path.join(path, txtfname)])

  File "C:\Anaconda\lib\subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()

  File "C:\Anaconda\lib\subprocess.py", line 710, in __init__
    errread, errwrite)

  File "C:\Anaconda\lib\subprocess.py", line 958, in _execute_child
    startupinfo)

WindowsError: [Error 2] The system cannot find the file specified

How can I call the pdftotext application from Python to convert a PDF to a text file?

jfs

1 Answer

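I had the same error, except with Popen. Error 2 here means Windows could not find the executable itself (pdftotext), not the PDF file. I fixed it by passing the full path to pdftotext.exe as the first item of the argument list in the subprocess call. Don't forget to escape the backslashes in that path, or use a raw string.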
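As a rough sketch of that fix (not tested against the asker's setup), the snippet below first looks for pdftotext on PATH and otherwise falls back to an explicit full path. The fallback path `C:\path\to\pdftotext.exe` and the helper names `find_pdftotext`, `build_command` and `pdf_to_text` are made up for illustration. It assumes Python 3.3 or later, because it relies on `shutil.which`; the traceback in the question looks like Python 2, where `distutils.spawn.find_executable` would probably be the closest equivalent.

```python
import os
import shutil
import subprocess

# Hypothetical install location: replace with wherever pdftotext.exe really lives.
# The raw string (r"...") avoids having to double every backslash.
PDFTOTEXT_FALLBACK = r"C:\path\to\pdftotext.exe"


def find_pdftotext(fallback=PDFTOTEXT_FALLBACK):
    """Return a path to the pdftotext executable, or raise if none is found."""
    exe = shutil.which("pdftotext")  # searches the directories listed in PATH
    if exe is None and os.path.isfile(fallback):
        exe = fallback
    if exe is None:
        raise FileNotFoundError(
            "pdftotext not found on PATH or at " + fallback)
    return exe


def build_command(exe, pdf_path, txt_path):
    """Build the argument list; -layout is the option used in the question."""
    return [exe, "-layout", pdf_path, txt_path]


def pdf_to_text(pdf_path, txt_path):
    """Run pdftotext and return its exit code (0 means success)."""
    return subprocess.call(build_command(find_pdftotext(), pdf_path, txt_path))
```

With the variables from the question, the call would be `pdf_to_text(os.path.join(path, pdffname), os.path.join(path, txtfname))`. Resolving the executable up front means a missing pdftotext shows up as a clear error message rather than a bare "cannot find the file specified".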

I do not know much about Anaconda and have not tested this myself, but I believe Conda may have a problem referencing scripts on Windows; see "fix references to scripts on windows".

astrimbu