I have run my code several times and it worked every time, but now it raises an error that I cannot explain (full traceback below). I am using tabula to read a PDF file; here is the code where the error occurs:
for it_page, page in enumerate(pages_id, start=0):
    print("page : ", page)
    tables = tabula.read_pdf(hermes_pdf_dir + "/" + pdf_name, pages=page)
    for i, table in enumerate(tables, start=1):
        print("titre retenu : " + pages_id_titres[it_page][1] + f"_{i}.xlsx")
        table.to_excel(os.path.join(folder_name, pages_id_titres[it_page][1] + " p" + str(page) + f"_{i}.xlsx"), index=False)
The error is raised on the line beginning with "tables = tabula.read_pdf(...)".
Here is the full error message:
Traceback (most recent call last):
  File "get_pdfs_hermes.py", line 299, in <module>
    read_pdf_download_csv(pdf_name2)
  File "get_pdfs_hermes.py", line 199, in read_pdf_download_csv
    tables = tabula.read_pdf(hermes_pdf_dir + "/" + pdf_name, pages = page)
  File "C:\Users\virgi\Python\lib\site-packages\tabula\io.py", line 322, in read_pdf
    output = _run(java_options, kwargs, path, encoding)
  File "C:\Users\virgi\Python\lib\site-packages\tabula\io.py", line 80, in _run
    result = subprocess.run(
  File "C:\Users\virgi\Python\lib\subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['java', '-Dfile.encoding=UTF8', '-jar', 'C:\\Users\\virgi\\Python\\lib\\site-packages\\tabula\\tabula-1.0.4-jar-with-dependencies.jar', '--pages', '104', '--guess', '--format', 'JSON', 'C:\\Users\\virgi\\Desktop\\virgile_stuff\\prog\\banking analyst\\financial_data/data/hermes_data/hermes_2014_rapportannuel_en.pdf']' returned non-zero exit status 1.
The error message mentions Java dependencies (maybe because tabula consists of tabula-py and tabula-java?). The most closely related issues I found for this kind of error say that Java should be updated, but I already have the latest version installed on my computer. Any idea what the cause could be?
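The traceback only reports "non-zero exit status 1", so the actual Java-side message is not visible. Below is a minimal sketch of how the stderr of a failing child process could be surfaced with the standard subprocess module. It is only an illustration under assumptions: `run_and_report` is a hypothetical helper (not part of tabula), and a small Python child process stands in for the failing java call. I have not confirmed how this tabula-py version captures or logs stderr internally.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit status, decoded stderr) instead of raising.

    Hypothetical helper for illustration only; it is not part of tabula.
    """
    try:
        completed = subprocess.run(cmd, capture_output=True, check=True)
    except subprocess.CalledProcessError as exc:
        # exc.stderr is populated here because capture_output=True was passed.
        stderr_text = (exc.stderr or b"").decode("utf-8", errors="replace")
        return exc.returncode, stderr_text
    return completed.returncode, completed.stderr.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Stand-in for the failing java call: a child process that writes
    # to stderr and exits with status 1.
    failing_cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('boom'); sys.exit(1)",
    ]
    status, err = run_and_report(failing_cmd)
    print("exit status:", status)
    print("stderr:", err)
```

Presumably the same helper could be given the exact command list shown in the traceback (the java -jar ... invocation), assuming java is on the PATH, to see what tabula-java itself prints for page 104.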