camelot in Python doesn't recognize all tables

Question

I use camelot in Python to extract tables from a PDF file. My code is as follows:

import camelot

# Parse page 1 with the lattice flavor (tables delimited by ruling lines).
tables = camelot.read_pdf(r'file_to_path',
                          flavor='lattice', pages='1',
                          shift_text=[''])

The problem is that camelot doesn't recognize all of the tables. To debug the issue visually, I ran:

camelot.plot(tables[0], kind='contour').show()

The resulting plot makes it clear that the fourth table was not recognized. I assume that's because it has a different shape: it has only row lines and no column lines.

Is there any way to handle this issue?

I'm now trying to figure out how table areas work. I think I'll take the _bbox property of each parsed table, find any empty space between them, and pass that space to table_areas when reading the PDF. – data_b77, Feb 03 '22 at 16:44
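The idea in this comment could be sketched roughly as follows. This is only an illustration, not the asker's actual code: the helper name `gaps_between_tables` and the `min_gap` threshold are hypothetical, and it assumes that each `_bbox` is an `(x1, y1, x2, y2)` tuple with the bottom-left and top-right corners in PDF points, while `table_areas` expects `"x1,y1,x2,y2"` strings with the top-left and bottom-right corners.

```python
def gaps_between_tables(bboxes, min_gap=20.0):
    """Return candidate table_areas strings for the vertical gaps between tables.

    bboxes: (x1, y1, x2, y2) tuples, bottom-left and top-right corners in
    PDF points (the layout Table._bbox is assumed to have).
    min_gap: hypothetical threshold; smaller gaps are treated as spacing.
    """
    if not bboxes:
        return []
    # Use the full horizontal extent of the tables that were detected.
    left = min(b[0] for b in bboxes)
    right = max(b[2] for b in bboxes)
    # PDF y grows upward, so the top-most table has the largest y2.
    ordered = sorted(bboxes, key=lambda b: b[3], reverse=True)
    areas = []
    for upper, lower in zip(ordered, ordered[1:]):
        gap_top = upper[1]     # bottom edge of the upper table
        gap_bottom = lower[3]  # top edge of the lower table
        if gap_top - gap_bottom >= min_gap:
            # Assumed table_areas order: left, top, right, bottom.
            areas.append(f"{left:.2f},{gap_top:.2f},{right:.2f},{gap_bottom:.2f}")
    return areas
```

With camelot installed, the result would be used along the lines of `areas = gaps_between_tables([t._bbox for t in tables])`, followed by a second `camelot.read_pdf(..., table_areas=areas)` call when `areas` is not empty. Note that `_bbox` is a private attribute, so it may change between camelot versions.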

score 0 · Answer 1 · answered Feb 03 '22 at 17:23

What worked for me was passing line_scale=40 as an additional argument when reading the PDF.
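The value 40 was found for one particular PDF, so other files may need a different setting. A minimal sketch for comparing a few values is shown below. The helper name `count_tables_per_line_scale` and its injectable `read_pdf` parameter are hypothetical conveniences; the sketch only assumes that `camelot.read_pdf` accepts `line_scale` for the lattice flavor and returns an object that supports `len()`.

```python
def count_tables_per_line_scale(path, scales=(15, 40, 60), read_pdf=None, **kwargs):
    """Count the tables the lattice flavor detects for each line_scale value.

    read_pdf: optional replacement for camelot.read_pdf (hypothetical hook,
    useful for trying the helper without a real PDF).
    kwargs: forwarded unchanged to read_pdf, e.g. pages='1'.
    """
    if read_pdf is None:
        import camelot  # imported lazily so the helper itself has no hard dependency
        read_pdf = camelot.read_pdf
    counts = {}
    for scale in scales:
        # A larger line_scale lets camelot pick up shorter ruling lines.
        tables = read_pdf(path, flavor="lattice", line_scale=scale, **kwargs)
        counts[scale] = len(tables)
    return counts
```

For example, `count_tables_per_line_scale(r'file_to_path', pages='1', shift_text=[''])` would return a dictionary mapping each tried value to the number of tables found. Very large values are probably best avoided, since text strokes may start to be detected as lines.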

data_b77

This resolved my issue :) – data_b77 Feb 03 '22 at 17:23
