I tried to convert pdf document (includes tables) into csv file. Unfortunately I failed. I have used the following approaches:
Used
pdfminer
first converted the pdf to text but structure of text file was not same as of pdf file .Used
pypdf2
first converted the pdf to text but structure of text file was not same as of pdf file.Used
pdftotext
first converted the pdf to text but structure of text file was not same as of pdf file.Used
slate
first converted the pdf to text but structure of text file was not same as of pdf file.
Kindly tell me the appropriate way to convert pdf to csv file. Some people have recommended me to parse the document to xml file and then to csv file. Even then I did not got the solution.
The PDF document looks as follows:
Are there any better tools which can convert pdf document (includes complex tables) to csv file?
Solutions in Python language would be highly appreciated.