I get a blank tab when I try to convert a PDF file to CSV using Tabula. I only want to convert a specific page of the PDF to .csv format. This is the output I get on stderr:
Got stderr: Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider loadDiskCache
WARNING: New fonts found, font cache will be re-built
Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Building on-disk font cache, this may take a while
Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Finished building on-disk font cache, found 372 fonts
My code:
import tabula
df = tabula.read_pdf("10iHP.pdf", pages="all")  # read tables from every page
tabula.convert_into("10iHP.pdf", "10iHP.csv", output_format="csv", pages="1")  # write only page 1 to CSV
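To check whether the stderr output above contains a real error or only warnings, one option is to look at the level prefix on each message line. This is a minimal sketch that assumes the messages follow the default java.util.logging layout (a timestamp line followed by a `LEVEL: message` line), as the pasted output appears to. The helper name `stderr_levels` is hypothetical and not part of tabula-py.

```python
import re

# Level names from java.util.logging (assumed to be the logger behind this output).
_LEVEL = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST):")


def stderr_levels(stderr_text):
    """Return the set of log levels found at the start of any line."""
    levels = set()
    for line in stderr_text.splitlines():
        match = _LEVEL.match(line.strip())
        if match:
            levels.add(match.group(1))
    return levels


# The stderr text from this post, pasted as-is.
sample = """Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider loadDiskCache
WARNING: New fonts found, font cache will be re-built
Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Building on-disk font cache, this may take a while
Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Finished building on-disk font cache, found 372 fonts
"""

print(stderr_levels(sample))
```

If only WARNING shows up and nothing at SEVERE level, the messages are probably about the font cache being built rather than a failed conversion, so the blank output may have a different cause, for example no table being detected on that page.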