I am trying to extract key-value pairs (e.g. securities-stock : 0.00) from an image of a personal finance statement using pytesseract.
So far I have not been able to get them.
How should I change my approach so that I can extract key-value pairs from the image?
So far I have only been able to extract the text and its coordinates:
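import numpy as np
import pytesseract
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r'path to tesseract'  # your path to the Tesseract engine

# Run OCR; the result has one row per detected element, with its bounding box, confidence and text
extracted = pytesseract.image_to_data(Image.open('image.png'), lang='eng', output_type='data.frame')

# Treat blank or whitespace-only text as missing, then drop the rows that have no text
extracted = extracted.replace(r'^\s*$', np.nan, regex=True)
extracted = extracted.dropna()

# Convert the remaining rows to a list of lists and wrap them in the result structure
image_data = extracted.to_numpy().tolist()
res = [{'image_data': image_data}]
print(res)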
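
One direction that might work, sketched here under assumptions rather than as a tested solution: group the OCR words by the line identifiers that `image_to_data` reports (`page_num`, `block_num`, `par_num`, `line_num`), order each line's words by `left`, and treat a trailing amount as the value with everything before it as the key. The helper `key_value_pairs` and the `AMOUNT` pattern are hypothetical names, and the pattern is only a guess at how amounts look on the statement. The function expects the dict-of-lists form, i.e. `pytesseract.image_to_data(Image.open('image.png'), lang='eng', output_type='dict')`.

```python
import re
from collections import defaultdict

# Assumed amount format: 0.00, 1,234.56, -12, $5 or (45.10). Adjust to the real statement.
AMOUNT = re.compile(r'^\(?-?\$?\d[\d,]*(\.\d+)?\)?$')


def key_value_pairs(data):
    """Pair each text line's label with its trailing amount.

    `data` is a dict of equal-length lists, as returned by
    pytesseract.image_to_data(..., output_type='dict').
    """
    # Collect the non-empty words of every OCR line, remembering their x position
    lines = defaultdict(list)
    for i, text in enumerate(data['text']):
        word = str(text).strip()
        if not word:
            continue
        line_id = (data['page_num'][i], data['block_num'][i],
                   data['par_num'][i], data['line_num'][i])
        lines[line_id].append((data['left'][i], word))

    pairs = []
    for line_id in sorted(lines):
        # Read the words of the line from left to right
        words = [word for _, word in sorted(lines[line_id])]
        # Keep only lines that end in something that looks like an amount
        if len(words) >= 2 and AMOUNT.match(words[-1]):
            key = ' '.join(words[:-1]).rstrip(':').strip()
            if key:
                pairs.append((key, words[-1]))
    return pairs
```

This relies on Tesseract placing the label and its amount on the same detected line. If the statement has wide gaps between columns, they may be split into separate blocks, in which case grouping rows by similar `top` values instead of `line_num` could be a more suitable variant.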