
I am new to OpenCV and need help extracting text from a borderless table in an image.

[Input image]

I want to extract the text and put the information into a pandas DataFrame.

[Expected output image]

Christoph Rackwitz

1 Answer

Extracting borderless tables with OpenCV alone is difficult, since there are no ruling lines to detect. However, you can use PaddleOCR's PPStructure engine to detect the table and OCR its contents. Below is a code sample:

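from io import StringIO

import cv2
import pandas as pd
from paddleocr import PPStructure

# Layout/table recognition engine; also return the OCR result inside each table
table_engine = PPStructure(recovery=True, return_ocr_result_in_table=True)

img_path = 'table_image.jpeg'
img = cv2.imread(img_path)
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError(img_path)

result = table_engine(img)

tables = []  # one DataFrame per detected table
for region in result:
    region.pop('img')  # drop the cropped image array, it is not needed here
    if region.get("type") == "table":
        html_table = region["res"]["html"]
        # Wrap the string in StringIO: passing literal HTML to read_html is deprecated
        tables.append(pd.read_html(StringIO(html_table))[0])

# DataFrame of the first detected table, if any
df = tables[0] if tables else pd.DataFrame()
print(df)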
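
To make it clearer what the "html" string contains and how it turns into rows, here is a minimal standard-library sketch that parses an HTML table into a list of rows. The `TableRows` class, the `html_table_to_rows` helper, and the sample HTML are all hypothetical and only for illustration. The real markup returned by PaddleOCR may differ (for example, it can contain merged cells, which this sketch does not handle).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_rows(html):
    """Return the table cells in `html` as a list of rows (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample standing in for region["res"]["html"]
sample_html = (
    "<html><body><table>"
    "<tr><td>Name</td><td>Qty</td></tr>"
    "<tr><td>Apple</td><td>3</td></tr>"
    "<tr><td>Pear</td><td>5</td></tr>"
    "</table></body></html>"
)

rows = html_table_to_rows(sample_html)
print(rows)
```

If the first row is the header, the rows can then be loaded with `pd.DataFrame(rows[1:], columns=rows[0])`. In practice `pd.read_html` is the more convenient route, since it also handles spans and type inference. This sketch is mainly useful for inspecting the output or when no HTML parser backend for pandas is installed.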
Luv_Python