For my application I am using the tabula-py package to convert a PDF to CSV. The Cloud Function is written for the Python 3.7 runtime, and I have listed the package in the requirements.txt file. But when the function runs I get this error:

File "/layers/google.python.pip/pip/lib/python3.7/site-packages/tabula/io.py", line 91, in _run
    raise JavaNotFoundError(JAVA_NOT_FOUND_ERROR)
tabula.errors.JavaNotFoundError: `java` command is not found from this Python process.Please ensure Java is installed and PATH is set for `java`

requirements file

tabula-py==1.4.1

main.py

import tabula

# file_id and required_page are set earlier in the function
df = tabula.read_pdf('/tmp/' + file_id + '.pdf', pages=required_page)[0]
tabula.convert_into('/tmp/' + file_id + '.pdf', '/tmp/' + file_id + '.csv', output_format="csv", pages=required_page, stream=False)

How do I resolve this? Is there an alternative?

Jasmine

1 Answer

The error is expected, since the runtime image used by the Google Cloud Functions Python 3.7 runtime does not include Java.

This means it is not possible to use this library within a Python Cloud Function, because Java is not among the system packages included in that runtime.
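If you want to confirm this from inside the function before tabula is called, a small check like the following may help. It is only a sketch: the helper names `java_available` and `require_java` are made up for illustration and are not part of tabula-py. It relies on the standard library's `shutil.which`, which looks for an executable on `PATH`.

```python
import shutil


def java_available() -> bool:
    """Return True if a `java` executable can be found on PATH."""
    return shutil.which("java") is not None


def require_java() -> str:
    """Return the path to `java`, or raise a clear error if it is missing.

    Hypothetical helper: call it before using tabula so the failure
    is reported up front with a readable message.
    """
    path = shutil.which("java")
    if path is None:
        raise RuntimeError("`java` not found on PATH; tabula-py cannot run here")
    return path
```

In the Cloud Functions Python 3.7 runtime described above, `java_available()` would be expected to return False, which matches the JavaNotFoundError in the question.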

However, as an alternative, you could use Google Cloud Run, a more modern product that covers the same functionality as Cloud Functions. You can follow the Quickstart to deploy your first service, and then either install Java in your Dockerfile or start from another Docker image that comes with Java installed.
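As a rough idea of what the Python side of such a service could look like, here is a minimal sketch using only the standard library. It assumes the container image already has Java installed, and it listens on the port given in the `PORT` environment variable, which is how Cloud Run tells the container where to listen. The names `java_status`, `Handler` and `main` are illustrative, and the handler only reports whether `java` is visible; the actual tabula conversion would go where the comment indicates.

```python
import os
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer


def java_status() -> str:
    """Describe whether `java` is visible to this Python process."""
    path = shutil.which("java")
    return f"java found at {path}" if path else "java not found on PATH"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A real service would run the tabula conversion here instead.
        body = java_status().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main() -> None:
    # Cloud Run passes the listening port in PORT; 8080 is a local fallback.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling `main()` as the container's entrypoint starts the server. Once a GET request reports that `java` is found, the same image should be able to run tabula.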

llompalles