
I'm new to Python and I'm getting the error below when running the following code, which is meant to take the contents of a PDF and put them in an Excel document. My OS is Windows 10 and I'm using VS Code via Anaconda3. I'm not sure what I'm doing wrong. Thank you all in advance.

FileNotFoundError: [WinError 2] The system cannot find the file specified

import tabula
file_path = (r"C:\Users\shattv\anaconda3\envs\venv1\TestInvoice.pdf")
oup = (r"C:\Users\shattv\anaconda3\envs\venv1\test.xlsx")
df = tabula.read_pdf(file_path,pages="all")
df.to_excel (oup)

I tried checking `os.getcwd()` and got the same path: `C:\Users\shattv\anaconda3\envs\venv1`. Below are screenshots of the Excel and PDF files. I also tried changing to a backslash and still got this error.

"C:/Users/shattv/anaconda3/envs/venv1/TestInvoice.pdf"

[Screenshots of the Excel and PDF files]

shattv
  • 13
  • 2
  • you might want to try removing the `r` in front of the double quote and use 2 backslashes instead of only one – Gugu72 Aug 07 '23 at 20:13
  • @Gugu72 Why would that be different? The whole point of raw strings is so you don't have to double the backslashes. – Barmar Aug 07 '23 at 20:47
  • **I also tried changing to a backslash** Don't you mean forward slash? This is backslash: ```\ ```, this is forward slash: `/` – Barmar Aug 07 '23 at 20:48
  • I know what I meant yeah, sometimes r-strings are not working for me while double-BACKSLASH are – Gugu72 Aug 09 '23 at 07:17
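On the raw-string question raised in these comments, a minimal sketch may help. It assumes only the standard library and reuses the path from the question; the variable names are made up for illustration. It suggests that the `r` prefix and doubled backslashes produce the same text, and that Windows path handling treats `/` and `\` as the same separator, so none of the three spellings should change which file is looked up.

```python
from pathlib import PureWindowsPath

# Three spellings of the path from the question.
raw = r"C:\Users\shattv\anaconda3\envs\venv1\TestInvoice.pdf"
doubled = "C:\\Users\\shattv\\anaconda3\\envs\\venv1\\TestInvoice.pdf"
forward = "C:/Users/shattv/anaconda3/envs/venv1/TestInvoice.pdf"

# A raw string and a doubled-backslash string hold identical characters.
same_text = raw == doubled

# PureWindowsPath applies Windows path rules on any OS and accepts
# both "/" and "\" as separators.
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)

print(same_text, same_path)
```

What does not work is a plain (non-raw) string with single backslashes, because `\U` in `"C:\Users..."` starts a Unicode escape and Python 3 rejects it as a syntax error.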

1 Answer


Try this:

  1. Remove the `r` prefix in front of the file path and use forward slashes.

    file_path = ("C:/Users/user/anaconda3/envs/venv1/TestInvoice.pdf")

That should work. If it does not, try this:

import os.path

file_path = "C:/Users/user/anaconda3/envs/venv1/TestInvoice.pdf"
# True only if the path points to an existing regular file
is_file = os.path.isfile(file_path)
print(is_file)

If this prints False, Python cannot locate the file; in that case, follow this tutorial. If it prints True, try installing Java and adding it to PATH. tabula-py is a simple Python wrapper of tabula-java, which can read tables in a PDF and then change their format. Since it is a wrapper around a Java tool, you should have these two things installed:

  1. Java 8+
  2. Python 3.8+

Once you have both, it should work. If not, I do not know how to fix that.
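The two checks above can be combined into one small diagnostic. This is only a sketch: `diagnose` is a hypothetical helper, not part of tabula-py, and it assumes that tabula-py starts Java by running a `java` command found on PATH. If that assumption holds, a missing `java` executable would be a plausible source of `[WinError 2]` even when the PDF path is correct.

```python
import os.path
import shutil


def diagnose(pdf_path):
    """Hypothetical helper: report which prerequisite appears to be missing."""
    # First check: can Python see the PDF at all?
    if not os.path.isfile(pdf_path):
        return "PDF not found: check the path"
    # Second check: is a `java` executable reachable through PATH?
    # shutil.which returns None when the command cannot be found.
    if shutil.which("java") is None:
        return "java not found on PATH: install Java or fix PATH"
    return "PDF and java both found"


print(diagnose("C:/Users/user/anaconda3/envs/venv1/TestInvoice.pdf"))
```

If the last message is printed and the error persists, running `java -version` in the same terminal that launches Python may show whether that environment sees a different PATH.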

Hansolo414
  • 16
  • 4
  • It printed true, so seems like the issue is with the Java install. Thank you for your help! – shattv Aug 08 '23 at 12:49