Questions tagged [tess4j]

Tess4J is a Java JNA wrapper for the Tesseract OCR API.

Description

A Java JNA wrapper for the Tesseract OCR API.
Tess4J is released and distributed under the Apache License, v2.0.

Release versions:

  • Version 1.3 (released: May 31, 2014)
  • Version 2.0 Beta (released: June 1, 2014)
  • Version 3.4.3 (released: January 14, 2018)

Features:

The library provides optical character recognition (OCR) support for:

  • TIFF, JPEG, GIF, PNG, and BMP image formats
  • Multi-page TIFF images
  • PDF document format

Links

Tess4J homepage
Tess4J GitHub

222 questions
2 votes, 1 answer

Seven Segment Digital Data Recognition using Tesseract / Java

I am trying to recognize seven-segment digital text from an image using Tess4J. My input is here. I have done some normalization as follows: 1) cropped the image; 2) converted it to binary. I wish to remove the jagged edges of the text from the image…
2 votes, 1 answer

Next step in image preprocessing for OCR with Tesseract (tess4j)

I've been trying to use Tesseract to identify some digits in a series of images and after scouring for advice I've made a number of improvements. So far I've attempted the following steps: Binarize the image at an appropriate threshold to pick out…
Alex Pritchard
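
The binarization step mentioned in the excerpt above can be sketched with the standard `java.awt.image` classes alone. This is only an illustration under assumptions: the class name `ThresholdBinarizer`, the method `binarize`, and the fixed global threshold are hypothetical and are not part of Tess4J or of the asker's code, and a real pipeline may need an adaptive threshold instead.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper, not part of Tess4J: global-threshold binarization. */
final class ThresholdBinarizer {

    /**
     * Returns a black-and-white copy of src. Pixels whose luminance is at or
     * above threshold (0-255) become white; all others become black.
     */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the Rec. 601 luma weights.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tiny synthetic image: one dark gray pixel and one light gray pixel.
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0x202020);
        img.setRGB(1, 0, 0xE0E0E0);
        BufferedImage bw = binarize(img, 128);
        System.out.printf("%06X %06X%n",
                bw.getRGB(0, 0) & 0xFFFFFF, bw.getRGB(1, 0) & 0xFFFFFF);
    }
}
```

The resulting `BufferedImage` is the kind of object that Tess4J's `doOCR` accepts, so a helper like this could sit directly in front of the OCR call.
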
2 votes, 1 answer

tess4j for linux UnsatisfiedLinkError

I am using the Tess4J API for OCR. I have successfully deployed my project on Windows, but I am stuck trying to run it on Ubuntu Linux. According to my research, I have to use .so files instead of .dll files on Linux.…
user2428568
2 votes, 4 answers

NoSuchFieldError: RESOURCE_PREFIX with a maven project using tess4j

Tess4J is an OCR library packaged with native libraries. I made a Maven project to test it, and I added the Maven installation path to Eclipse. I added the M2_HOME, MAVEN_HOME and JAVA_HOME environment variables. Here is my parent pom
sliders_alpha
2 votes, 1 answer

How to use user-words in Tesseract (Java)?

I am using Tesseract for OCR and I have added a few additional words to "fin.user-words" (I would like to avoid creating a new word list and replacing tessdata/fin.word-dawg with it). I succeeded in doing it from the command prompt: >tesseract…
ABData
2 votes, 1 answer

deploy web application using tess4j in linux

I have to search documents stored in databases. Some of these documents are images, so I used Tess4J to read them. On Windows with Eclipse the project works fine with Tess4J; also, if I deploy the application in Tomcat 6.35 on Windows 7, the project…
JV_MI
2 votes, 1 answer

Tess4j : java.lang.UnsatisfiedLinkError: Unable to load library

I'm using tess4j.jar in my Eclipse project. When I run it in Eclipse my project works fine, but when I try to run the exported runnable .jar file it always fails with "java.lang.UnsatisfiedLinkError: Unable to load library 'libtesseract302'"…
HelloWorld0815
2 votes, 2 answers

How to point to eng.traineddata in Tess4J OCR project

I am new to Tess4J. I'm getting this error: Error opening data file ./tessdata/eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language…
Mazolo
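
The error quoted above comes down to whether a `tessdata` directory containing `eng.traineddata` can be found. A minimal sketch of checking this before OCR starts, using only `java.nio.file`, might look as follows. The class `TessdataLocator` and the method `findTessdata` are hypothetical names introduced for illustration. The layout (TESSDATA_PREFIX naming the parent of `tessdata`) follows the quoted error message; some Tesseract versions expect the variable to point at the `tessdata` directory itself, so treat the layout as an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper, not part of Tess4J: checks for a language model before OCR starts. */
final class TessdataLocator {

    /**
     * Returns the "tessdata" directory under parent if it holds lang.traineddata,
     * or an empty Optional when that file is missing.
     */
    static Optional<Path> findTessdata(Path parent, String lang) {
        Path tessdata = parent.resolve("tessdata");
        Path model = tessdata.resolve(lang + ".traineddata");
        return Files.isRegularFile(model) ? Optional.of(tessdata) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: TESSDATA_PREFIX names the parent of "tessdata", as the error message says.
        String prefix = System.getenv("TESSDATA_PREFIX");
        Path parent = Paths.get(prefix != null ? prefix : ".");
        Optional<Path> dir = findTessdata(parent, "eng");
        if (dir.isPresent()) {
            // This is the path one would hand to Tess4J, e.g. via Tesseract.setDatapath(...).
            System.out.println("Found language data in " + dir.get().toAbsolutePath());
        } else {
            System.out.println("eng.traineddata not found under " + parent.toAbsolutePath());
        }
    }
}
```

Failing fast with a clear message like this tends to be easier to diagnose than the native "Failed loading language" output, especially when the working directory differs between the IDE and a deployed application.
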
1 vote, 0 answers

Improving accuracy in Tess4j - best options

I'm using Tess4J for text recognition from an image, but I'm having big problems with recognition accuracy. I've already done some tests processing the image with the OpenCV tools, but although it helped, the problem is still not solved. And I also…
Eugenio
1 vote, 0 answers

Problem using the Tess4J OCR; I am trying to work on a .jpeg screenshot

ITesseract instance = new Tesseract(); try { BufferedImage img = null; img = ImageIO.read(new File("C:\\Users\\nicol\\eclipse-workspace2\\Read\\images\\text.jpeg")); …
nicolajava
1 vote, 0 answers

Tess4j - Searchable PDF

I am able to extract the text from images, or extract the images from a PDF and then run OCR on them to get the text, but I want to create a searchable PDF out of images or a PDF with images in it. It must somehow work with ITessAPI (TessAPI1 or…
ms88-aut
1 vote, 1 answer

Tess4j tesseract - How can you differentiate between columns or rows in a table?

I am working a bit with tess4j tesseract in Java. It works well and it allows me to do what I need. But I have come across an issue that I cannot solve without guidance or help. Let us say, I have the following image: This then provides me with the…
LanDanois
1 vote, 0 answers

Could not initialize class net.sourceforge.tess4j.TessAPI on ec2

I have developed a Java application for OCR with the Tess4J library, and it works well on my local machine (I'm using Windows 10). I have deployed it to an EC2 instance which runs Red Hat, and once doOCR() is invoked it returns the following…
pl-jay
1 vote, 1 answer

Tesseract failed loading language (Tess4j / Java / Netbeans)

I'm currently working on a program which should detect letters and numbers in an image using OpenCV and Tess4J. For that I downloaded and installed Tesseract (Version 5.0.0 alpha) from https://github.com/UB-Mannheim/tesseract/wiki, downloaded the…
Ypselon
1 vote, 1 answer

Is it possible to use the TessAPI1.TessPDFRendererCreate API of Tess4J without needing to create physical files?

I am using the Tesseract Java API (Tess4J) to convert TIFF images to PDFs. This works nicely, but I am forced to write both the source TIFF image and the output PDF to local filestore as actual physical files in order to use the…
Jon H