
I am trying to improve the OCR result by changing parameters through the pytesseract config. Is it possible to change load_system_dawg and load_freq_dawg, as described in https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#page-segmentation-method? The words I am trying to read are not really English; they are co-ords like XYZ ### and some other unique sequences of letters. See the screenshot:

minecraft screenshot

I can adjust --psm through the config, but if I try --load_system_dawg 0 I get an error saying either that there is no such command-line argument or that a file does not exist. I dunno, it seemed like it was worth a shot...

params = r'--psm 11'
string = pytesseract.image_to_string(img, config = params)

I'm assuming there isn't a way to do this through Python, but if someone could point me to how to change it I would appreciate it, as I don't know much C++. Would a change made that way be picked up by pytesseract? I have also tried changing user-patterns, but I'm not sure whether that is the better way to go.

Robin White

1 Answer


You need to know the following:

For instance, if you apply a threshold, the image becomes:

[image: result after thresholding]

Next, apply bitwise_not:

[image: result after bitwise_not]

Now if you read it with page segmentation mode 6 (assume the image is a single uniform block of text), the output is:

Hinecratt 1.14.4 1.14.4 / vanilla Javea: 136 51 64bit
68 fps (8 chunk updates) T: inf vsune fancy-clouds veo Hem: 4ah 8757 2648NE
Integrated server @ 11 ms ticks, 13 tx, 735 rx Allocated: 814% 1664M6
C: 1615376 (5) 0: 15, pC: G66, pu: 6, ab: Se
c afte oe CPU: 16% AND Fiyzen 7 L786 ECight-Core Processor
Client Chunk Cache: 1659, 75 Display: 1926%1880 CHVIOIA Corporation?
ServerChunkCache: S734 GeForce OTA 1666 606/PCle/55E2
ninecrattoverworld FC: a 4.6.4 HVIDIA 431.68
42: S6L641 / 11.66668 ¢ 361.939 Targeted Block
Block: S61 11 361 Hinecrattiron_ore
Chunk: 13 11 3 in 18 @ 22
Facing: west (Towards negative 43 095.4 7 15.79 Targeted Fluid
Client Light: 11 (8 sky, 11 block? ninecrattempty
Server Light: (8 sky, 11 black?
CHS: 67 MH: 67
5H 3:67 0: 67 M: 67 ML: 67
Biome: minecrattdesert
Local Difficulty: 165 7/7 6.66 (Day 243
Looking at block: 295 16 361
Looking at liquid: 295 16 361
Sounds: 37247 + a7G
Debug: Pie Cehittl: hidden FPS + TPS Caltl: hidden
For hele: press Fa + oO

Code:

import cv2
import numpy as np
import pytesseract

# Load the image
img = cv2.imread("sPQDo1c.png")

# Convert to the HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

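# Threshold: keep only pixels with zero saturation and value in [214, 225]
# (the light-colored text); the result is a mask with the text in white
thr = cv2.inRange(hsv, np.array([0, 0, 214]), np.array([179, 0, 225]))

# Bitwise-not: invert the mask so the text is dark on a light background
bnt = cv2.bitwise_not(thr)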

# OCR
print(pytesseract.image_to_string(bnt, config="--psm 6"))

# Display
cv2.imshow("", bnt)
cv2.waitKey(0)
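
Regarding load_system_dawg and load_freq_dawg from the question: the tesseract command line takes configuration variables as -c VAR=VALUE rather than as --VAR VALUE, and pytesseract passes the config string through to the command line. A minimal sketch of building such a string follows; build_config and ocr are hypothetical helper names, and whether these two dictionary variables actually take effect when set this way may depend on the Tesseract version, so treat that as an assumption to verify.

```python
def build_config(psm, variables=None):
    """Build a tesseract config string: --psm plus -c VAR=VALUE pairs.

    `build_config` is a hypothetical helper, not part of pytesseract.
    """
    parts = [f"--psm {psm}"]
    for name, value in (variables or {}).items():
        parts.append(f"-c {name}={value}")
    return " ".join(parts)


def ocr(image, config):
    """Run OCR with the given config string (hypothetical wrapper)."""
    import pytesseract  # imported lazily so the helper above works without it
    return pytesseract.image_to_string(image, config=config)


# Assumption: disabling both dictionaries helps with non-dictionary text.
config = build_config(11, {"load_system_dawg": 0, "load_freq_dawg": 0})
print(config)
```

The resulting string can be passed as config=... to image_to_string in the code above, for example ocr(bnt, config).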

I'm using pytesseract version 0.3.7
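
Since the goal is the coordinates, the cleanly recognized lines such as "Looking at block: 295 16 361" can be pulled out of the OCR text afterwards. This is only a sketch: extract_triples and the pattern are hypothetical, and they assume a line of the form "label: X Y Z" with three integers and nothing else after them, so misread lines like "Block: S61 11 361 ..." are skipped.

```python
import re

# "label: X Y Z" on a single line, three integers, nothing after them.
TRIPLE = re.compile(
    r"^([^:\n]+):[ \t]*(-?\d+)[ \t]+(-?\d+)[ \t]+(-?\d+)[ \t]*$",
    re.MULTILINE,
)


def extract_triples(text):
    """Return {label: (x, y, z)} for every line matching TRIPLE."""
    return {
        m.group(1).strip(): tuple(int(m.group(i)) for i in (2, 3, 4))
        for m in TRIPLE.finditer(text)
    }


# A few lines copied from the OCR output above.
sample = (
    "Biome: minecrattdesert\n"
    "Chunk: 13 11 3 in 18 @ 22\n"
    "Looking at block: 295 16 361\n"
    "Looking at liquid: 295 16 361\n"
)
print(extract_triples(sample))
```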

Ahmet