I am new to OpenCV and need help extracting text from a borderless table in an image. I need to extract the text from the image below.
I want to put the extracted information in a data frame.
Extracting borderless tables with OpenCV alone is a bit of a challenge. However, you can use PaddleOCR, whose PPStructure engine detects the table and runs OCR on it. Below is a code sample:
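from io import StringIO

import cv2
import pandas as pd
from paddleocr import PPStructure

# Table-structure engine; return_ocr_result_in_table keeps the OCR text in the table result
table_engine = PPStructure(recovery=True, return_ocr_result_in_table=True)

img_path = 'table_image.jpeg'
img = cv2.imread(img_path)
result = table_engine(img)

for line in result:
    line.pop('img')  # drop the cropped region image, it is not needed here
    if line.get("type") == "table":
        html_table = line.get("res").get("html")
        # read_html returns a list of DataFrames; StringIO avoids the
        # deprecation warning newer pandas raises for literal HTML strings
        html_data = pd.read_html(StringIO(html_table))
        df = pd.DataFrame(html_data[0])
        print(df)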
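Note that pd.read_html needs an HTML parser backend such as lxml (or bs4 with html5lib) installed. If that is not available, the HTML string can also be turned into rows with the standard library alone. The following is only a minimal sketch: the TableRows class, the html_table_to_rows helper, and the sample_html string are hypothetical names and data made up for illustration, not PaddleOCR output, and the parser ignores colspan/rowspan.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_rows(html):
    """Return the table in `html` as a list of rows (lists of strings)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical HTML, shaped like a simple table string
sample_html = (
    "<html><body><table>"
    "<tr><td>Item</td><td>Qty</td></tr>"
    "<tr><td>Apple</td><td>3</td></tr>"
    "</table></body></html>"
)
rows = html_table_to_rows(sample_html)
print(rows)
```

If the first row holds the column names, the rows can then be passed to pandas with pd.DataFrame(rows[1:], columns=rows[0]).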