
I am working on a project where I convert scanned images to text using Python's pytesseract.image_to_string. Everything works and I get a text file, but the output does not preserve the multiple spaces between columns: each gap is collapsed to a single space. I am using Python 2.7 on Ubuntu 14.04.

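My code:

    from PIL import Image
    import pytesseract

    url = "/home/raghu/raghu2.png"
    img = Image.open(url)

    # OCR the image; by default, runs of spaces between columns collapse to one
    text = pytesseract.image_to_string(img, lang='eng')
    print text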

Is there a configuration option I can pass in the call above to keep the spacing?

raghu
  • Try this https://stackoverflow.com/questions/49810566/python-pytesseract-extracts-incorrect-text-from-image/51964830#51964830 -- Good Luck -- – Sunku Vamsi Tharun Kumar Aug 22 '18 at 10:39
  • Thanks, I got the solution. Just pass config='-c preserve_interword_spaces=1' to pytesseract.image_to_string and it will work. – raghu Aug 23 '18 at 06:00
  • Another similar link (for preserving completeness) is https://stackoverflow.com/questions/51668339/preserving-spaces-in-tesseract thanks – Sundaresh Jan 16 '19 at 16:55
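
The fix mentioned in the comments can be sketched roughly as follows. This is a minimal illustration, not a tested recipe for every image: the helper names build_config and ocr_with_spacing are hypothetical, and it assumes pytesseract, Pillow, and the Tesseract binary are installed. The only part taken from the thread is the config string '-c preserve_interword_spaces=1'.

```python
def build_config(preserve_spaces=True):
    """Return the Tesseract config string to pass to pytesseract.

    'preserve_interword_spaces' is a Tesseract config variable; setting it
    to 1 asks Tesseract to keep runs of spaces instead of collapsing them.
    (build_config is a hypothetical helper, not part of pytesseract.)
    """
    if preserve_spaces:
        return "-c preserve_interword_spaces=1"
    return ""


def ocr_with_spacing(image_path, lang="eng"):
    """OCR an image while trying to keep the column spacing.

    Imports are done inside the function so that the config helper above
    can be used even where pytesseract is not installed.
    """
    from PIL import Image
    import pytesseract

    img = Image.open(image_path)
    # Pass the option through pytesseract's 'config' keyword argument.
    return pytesseract.image_to_string(img, lang=lang, config=build_config())


if __name__ == "__main__":
    # Path taken from the question; replace it with your own image.
    print(ocr_with_spacing("/home/raghu/raghu2.png"))
```

How closely the spacing matches the original layout likely depends on the Tesseract version and the image quality, so the result may still need some post-processing.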
