
Any idea how to do this?

TesseractEngine engine = new TesseractEngine("./tessdata", "eng", EngineMode.Default);

Usually, for one language, passing its abbreviation is enough. But what if I want to scan an image that contains multiple languages? By the way, I use the package by Charles Weld. Thanks.

Riiko

1 Answer


According to here, the + syntax is supported, so you just need to join the language codes with a + sign, like the following:

TesseractEngine engine = new TesseractEngine("./tessdata", "jpn+eng", EngineMode.Default); // jpn+eng for Japanese and English

Also, according to here:

The output can be different based on the order of languages, so -l eng+hin can give different result than -l hin+eng.

From what I can see, the language you specify first has better accuracy.
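If you want to check the effect of language order on your own images, here is a minimal sketch that runs the same image through both orders and prints the results side by side. The class name, helper names, and the `./sample.png` path are made up for illustration. It assumes the Charles Weld `Tesseract` package is referenced and that both `jpn.traineddata` and `eng.traineddata` are present in `./tessdata`.

```csharp
using System;
using Tesseract; // Charles Weld's .NET wrapper for Tesseract

// Hypothetical helper class, not part of the Tesseract package.
public static class MultiLanguageOcrSketch
{
    // Joins language codes with '+', e.g. ("jpn", "eng") becomes "jpn+eng".
    public static string JoinLanguages(params string[] codes)
    {
        return string.Join("+", codes);
    }

    // Runs OCR on one image with the given language string and returns the text.
    public static string Recognize(string tessdataPath, string languages, string imagePath)
    {
        using (var engine = new TesseractEngine(tessdataPath, languages, EngineMode.Default))
        using (var image = Pix.LoadFromFile(imagePath))
        using (var page = engine.Process(image))
        {
            return page.GetText();
        }
    }

    // Prints the OCR result for both language orders so they can be compared by eye.
    public static void CompareOrders(string tessdataPath, string imagePath)
    {
        string jpnFirst = JoinLanguages("jpn", "eng");
        string engFirst = JoinLanguages("eng", "jpn");

        Console.WriteLine("--- " + jpnFirst + " ---");
        Console.WriteLine(Recognize(tessdataPath, jpnFirst, imagePath));

        Console.WriteLine("--- " + engFirst + " ---");
        Console.WriteLine(Recognize(tessdataPath, engFirst, imagePath));
    }
}
```

Calling `MultiLanguageOcrSketch.CompareOrders("./tessdata", "./sample.png")` should show whether the order makes a visible difference for your image. A new engine is created for each run because the language string is fixed when the engine is constructed.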

Jesse Good
  • Btw, for the second thing you mentioned (The output can be different based on the order of languages, so -l eng+hin can give different result than -l hin+eng), how can I use it? – Riiko Nov 28 '21 at 08:15
  • @Riiko: `how can I use it?`. I'm not sure what you are asking. In my example, `"jpn+eng"` can give different results than `"eng+jpn"`. In other words, order matters. The documentation I quoted is from the command-line documentation. – Jesse Good Nov 28 '21 at 08:21
  • I see, then I don't need that. The first one works just fine, and I'm not using the command stuff. – Riiko Nov 28 '21 at 08:34