I need to extract tabular data from PDFs. Some tables in the PDFs consist of only a single row. I have been trying to extract the data using the Camelot library.
Code for extraction using Camelot:
pip install camelot-py[cv] tabula-py
import camelot
file = 'xyz.pdf'
tables = camelot.read_pdf(file, pages="all")
tables[6].df
The code above does not extract tables that consist of a single row.
For instance, in this PDF: https://www.nirfindia.org/nirfpdfcdn/2022/pdf/Engineering/IR-E-U-0306.pdf, Camelot does not detect the last table (under the heading Faculty Details), as it consists of only one row.
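To narrow down where the table gets lost, it may help to compare what each Camelot flavor detects on each page. The following is only a diagnostic sketch, not a confirmed fix: the helper names (tables_per_page, single_row_tables, compare_flavors) are made up for illustration, and it assumes each detected table exposes the .page and .shape attributes that Camelot's Table objects carry. I have not verified that the stream flavor picks up the missing table.

```python
from collections import Counter


def tables_per_page(tables):
    """Count detected tables per page; each item needs a `.page` attribute."""
    return Counter(int(t.page) for t in tables)


def single_row_tables(tables):
    """Return the tables whose `.shape` reports exactly one row."""
    return [t for t in tables if t.shape[0] == 1]


def compare_flavors(path, pages="all"):
    """Run Camelot with both flavors and summarize what each one detects."""
    import camelot  # imported lazily so the helpers above work without it

    report = {}
    for flavor in ("lattice", "stream"):
        tables = camelot.read_pdf(path, pages=pages, flavor=flavor)
        report[flavor] = {
            "per_page": tables_per_page(tables),
            "single_row": len(single_row_tables(tables)),
        }
    return report
```

Calling compare_flavors('xyz.pdf') returns one summary per flavor. If the last page shows fewer tables under "lattice" than under "stream", that would suggest the line-based detection is what skips the single-row table.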
Can someone suggest a workaround?