0

CalledProcessError: Command '['java', '-Dfile.encoding=UTF8', '-jar', 'C:\Users\vijv2c13136\AppData\Local\Continuum\anaconda2\lib\site-packages\tabula\tabula-1.0.2-jar-with-dependencies.jar', '--pages', 'all', '--guess', '--format', 'JSON', 'TONY.pdf']' returned non-zero exit status 2

When I try to print the tables in the .pdf file, I get this error.

from tabula import wrapper

print(wrapper.read_pdf("TONY.pdf", multiple_tables=True, pages="all"))

This is my code for extracting tables from the .pdf file, but it shows the above error when I try to print them.
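The exit status in the traceback does not say why the Java process failed; the text it wrote to stderr usually does. Below is a minimal sketch, assuming Python 3.7+ (for `capture_output`), of running a command and reading its stderr when it fails. The helper name `run_and_report` and the stand-in command are hypothetical; to use it on the real problem, replace `cmd` with the argument list copied from the error message.

```python
import subprocess
import sys

def run_and_report(cmd):
    """Run cmd; on failure return (exit status, stderr text) instead of raising."""
    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        return err.returncode, err.stderr
    return 0, ""

# Stand-in command that writes to stderr and exits with status 2,
# mimicking the failing Java call. Replace with the real argument list.
cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(2)"]

status, stderr_text = run_and_report(cmd)
print(status, stderr_text)
```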

Brian Tompsett - 汤莱恩
A.Viji

2 Answers

2

One way is to put the table in a pandas DataFrame, render it with matplotlib, and then save it (or display it).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
fig.patch.set_visible(False)
ax.axis('off')
ax.axis('tight')

df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))

ax.table(cellText=df.values, colLabels=df.columns, loc='center')

fig.tight_layout()

# Save before showing: after plt.show() returns, plt.savefig would write a blank figure.
plt.savefig("tablepdf.pdf", bbox_inches='tight')

plt.show()


DirtyBit
  • But this code only prints the default column names to a .pdf file. I need all the tables in the .pdf file. What can I do for multiple tables? – A.Viji Dec 13 '18 at 06:38
  • Since I do not see any code, I can't say what you're doing. Perhaps a range-based loop that checks the number of tables and keeps saving them to the PDF until the table count is reached should do it. – DirtyBit Dec 13 '18 at 06:41
0

No real need to use dataframes, simply do:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.axis('off')

ax.table(cellText=[
                    ['row1', 'row1'],
                    ['row2', 'row2']
                  ],
         colLabels=['col1', 'col2'],
         loc='center')

fig.tight_layout()
plt.savefig("table.pdf", bbox_inches='tight')
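For several tables, as asked in the comments above, one option is to loop over them and write each to its own page of a single PDF. The following is only a sketch under assumptions: it uses matplotlib's `PdfPages`, the `tables` list is made-up sample data, and `save_tables` is a hypothetical helper name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical sample data: each entry is (column labels, rows).
tables = [
    (['col1', 'col2'], [['a1', 'a2'], ['b1', 'b2']]),
    (['x', 'y', 'z'], [['1', '2', '3']]),
]

def save_tables(tables, path):
    """Write each table to its own page of one PDF; return the page count."""
    with PdfPages(path) as pdf:
        for col_labels, rows in tables:
            fig, ax = plt.subplots()
            ax.axis('off')
            ax.table(cellText=rows, colLabels=col_labels, loc='center')
            pdf.savefig(fig, bbox_inches='tight')
            plt.close(fig)  # free the figure once its page is written
        return pdf.get_pagecount()

pages = save_tables(tables, "tables.pdf")
```

If the tables are pandas DataFrames, `df.columns` and `df.values` can be passed as the labels and rows, as in the other answer.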

Harsh Verma