When I parse a PDF file with tabula-py in Python, I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: lines must be orthogonal, vertical and horizontal
        at technology.tabula.Ruling.intersectionPoint(Ruling.java:214)
        at technology.tabula.Ruling.findIntersections(Ruling.java:378)
        at technology.tabula.extractors.SpreadsheetExtractionAlgorithm.findCells(SpreadsheetExtractionAlgorithm.java:134)
        at technology.tabula.extractors.SpreadsheetExtractionAlgorithm.extract(SpreadsheetExtractionAlgorithm.java:63)
        at technology.tabula.extractors.SpreadsheetExtractionAlgorithm.extract(SpreadsheetExtractionAlgorithm.java:41)
        at technology.tabula.CommandLineApp$TableExtractor.extractTablesSpreadsheet(CommandLineApp.java:452)
        at technology.tabula.CommandLineApp$TableExtractor.extractTables(CommandLineApp.java:410)
        at technology.tabula.CommandLineApp.extractFile(CommandLineApp.java:180)
        at technology.tabula.CommandLineApp.extractFileTables(CommandLineApp.java:124)
        at technology.tabula.CommandLineApp.extractTables(CommandLineApp.java:106)
        at technology.tabula.CommandLineApp.main(CommandLineApp.java:76)

The Java version is 1.8.0_291:

$ java -version
java version "1.8.0_291"
Java(TM) SE Runtime Environment (build 1.8.0_291-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.291-b10, mixed mode)

Could you please tell me the cause of the error and how to fix it?

atk
  • Thank you for the comment. What could be the reason for the non-orthogonal rows being generated? – atk Aug 29 '23 at 08:09
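
To make the wording of the exception concrete, here is a minimal sketch in plain Python of the kind of check the message describes: an intersection is only computed for one exactly horizontal and one exactly vertical line. The `Ruling` class and `intersection_point` function below are hypothetical illustrations, not tabula's actual code, and the sketch assumes the two segments really do cross.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ruling:
    """A straight line segment from (x1, y1) to (x2, y2). Hypothetical helper."""
    x1: float
    y1: float
    x2: float
    y2: float

    def is_horizontal(self) -> bool:
        # Horizontal only if both endpoints share exactly the same y.
        return self.y1 == self.y2

    def is_vertical(self) -> bool:
        # Vertical only if both endpoints share exactly the same x.
        return self.x1 == self.x2


def intersection_point(a: Ruling, b: Ruling) -> tuple:
    """Return where a horizontal and a vertical ruling meet.

    Assumes the segments actually cross; raises if the pair is not
    one horizontal plus one vertical line.
    """
    if a.is_horizontal() and b.is_vertical():
        horizontal, vertical = a, b
    elif a.is_vertical() and b.is_horizontal():
        horizontal, vertical = b, a
    else:
        raise ValueError("lines must be orthogonal, vertical and horizontal")
    return (vertical.x1, horizontal.y1)


# A clean grid line pair works.
point = intersection_point(Ruling(0, 5, 10, 5), Ruling(3, 0, 3, 10))

# A slightly skewed "horizontal" line (y drifts from 5 to 5.2) is rejected.
try:
    intersection_point(Ruling(0, 5, 10, 5.2), Ruling(3, 0, 3, 10))
    skewed_rejected = False
except ValueError:
    skewed_rejected = True
```

Under this reading, one possible cause (an assumption, since the PDF itself is not shown) is a drawn line in the file that is slightly slanted or diagonal, for example in a rotated or skewed page. The stack trace goes through `extractTablesSpreadsheet`, which corresponds to lattice mode, so it might be worth trying `tabula.read_pdf` with `stream=True` instead of `lattice=True` to see whether the same page parses.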
