I'm trying to develop a simple application (OpenCV, Tesseract and Java) that reads the numbers from a photo of a water meter. I am new to OpenCV and I am stuck on detecting the digits inside the rectangles.
So I want to get the value "00295" as the result.
Here is an example of the water meter. So far I have not been able to get this result.
Steps:
- Convert the image to grayscale
- Apply a 3x3 Gaussian blur
- Apply a Sobel filter, then an Otsu threshold
- Run OCR with only digit characters allowed
But as a result I get a bunch of random numbers from the other labels on the meter. Can you please give some suggestions and show how to detect these 5 rectangles and read the digits from them? Thanks in advance.
Here is the code:
import java.io.File;

import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

// The class declaration was not in the pasted code; this name is a placeholder.
public class WaterMeterOcr {

    // Same value as Imgproc.THRESH_OTSU
    private static final int CV_THRESH_OTSU = 8;

    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat imgGray = new Mat();
        Mat imgGaussianBlur = new Mat();
        Mat imgSobel = new Mat();
        Mat imgThreshold = new Mat();

        // Path to picture
        String inputFilePath = "D:/OCR/test.jpg";
        Mat img = Imgcodecs.imread(inputFilePath);
        Imgcodecs.imwrite("preprocess/1_True_Image.png", img);

        // 1. Grayscale
        Imgproc.cvtColor(img, imgGray, Imgproc.COLOR_BGR2GRAY);
        Imgcodecs.imwrite("preprocess/2_imgGray.png", imgGray);

        // 2. 3x3 Gaussian blur (save the blurred image, not the gray one)
        Imgproc.GaussianBlur(imgGray, imgGaussianBlur, new Size(3, 3), 0);
        Imgcodecs.imwrite("preprocess/3_imgGaussianBlur.png", imgGaussianBlur);

        // 3. Sobel derivative in x (dx = 1, dy = 0), same depth as the input
        Imgproc.Sobel(imgGaussianBlur, imgSobel, -1, 1, 0);
        Imgcodecs.imwrite("preprocess/4_imgSobel.png", imgSobel);

        // 4. Otsu threshold
        Imgproc.threshold(imgSobel, imgThreshold, 0, 255, CV_THRESH_OTSU);
        Imgcodecs.imwrite("preprocess/5_imgThreshold.png", imgThreshold);

        // 5. OCR on the thresholded image, digits only
        File imageFile = new File("preprocess/5_imgThreshold.png");
        Tesseract tesseract = new Tesseract();
        // tessdata directory
        tesseract.setDatapath("tessdata");
        tesseract.setTessVariable("tessedit_char_whitelist", "0123456789");
        try {
            String result = tesseract.doOCR(imageFile);
            System.out.println(result);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
    }
}
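To make the target concrete, here is a minimal sketch of the check I would like the final OCR string to pass. It is plain Java and does not touch OpenCV or Tesseract. The class `ReadingFilter`, the method `extractReading` and the constant `EXPECTED_DIGITS` are hypothetical names made up for this illustration, and the assumption that the meter always shows exactly 5 digits comes only from my example photo.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of OpenCV or Tess4J.
class ReadingFilter {

    // Assumption: the meter has exactly 5 digit boxes, as in my photo.
    static final int EXPECTED_DIGITS = 5;

    // Strips every character that is not 0-9 and accepts the result
    // only if exactly EXPECTED_DIGITS digits remain.
    static Optional<String> extractReading(String ocrText) {
        if (ocrText == null) {
            return Optional.empty();
        }
        String digits = ocrText.replaceAll("[^0-9]", "");
        return digits.length() == EXPECTED_DIGITS
                ? Optional.of(digits)
                : Optional.empty();
    }

    public static void main(String[] args) {
        // The kind of output I am hoping for: only the meter reading.
        System.out.println(extractReading("00295\n"));
        // The kind of output I get now: extra digits from other labels.
        System.out.println(extractReading("12 00295 7 3"));
    }
}
```

With the current preprocessing the OCR string contains digits from the other labels as well, so a check like this rejects it. That is why I think the 5 rectangles have to be located before running OCR.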