When running OCR on a dictionary PDF with Document AI, the text often includes IPA characters, e.g. ʷ, ə, etc. Is there a way to recognize them correctly, such as setting a certain language hint? Currently ʷ is recognized as w and ə as a.

jonah_w
1 Answer
Document AI only detects IPA characters that also appear in one of its supported languages.
However, this could be a useful feature, so I filed a feature request in the Public Issue Tracker: https://issuetracker.google.com/287464641
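
Until something like that exists, it may help to measure how many IPA symbols actually survive in the OCR output. The following is only a rough sketch using the Python standard library, not part of Document AI: the function name `find_ipa_chars` and the sample string are made up for illustration, and it assumes the symbols of interest live in the IPA Extensions and Spacing Modifier Letters Unicode blocks (which cover ə and ʷ, but not every character used in IPA transcriptions).

```python
import unicodedata

# Unicode blocks that hold many IPA-specific symbols (an assumption of this
# sketch; plain Latin letters used in IPA, such as "a" or "w", are not covered).
IPA_BLOCKS = (
    (0x0250, 0x02AF),  # IPA Extensions, includes ə (U+0259)
    (0x02B0, 0x02FF),  # Spacing Modifier Letters, includes ʷ (U+02B7)
)


def find_ipa_chars(text):
    """Return (char, code point, Unicode name) for each character in the blocks above."""
    found = []
    for ch in text:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in IPA_BLOCKS):
            found.append((ch, f"U+{cp:04X}", unicodedata.name(ch, "UNKNOWN")))
    return found


# Hypothetical OCR output; in practice this would be the text returned by the processor.
ocr_text = "kʷə"
for entry in find_ipa_chars(ocr_text):
    print(entry)
```

If the list comes back empty for a page that visibly contains such symbols, they were most likely replaced by look-alike Latin letters, as described in the question.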

Holt Skinner