-1

I have an HTML file whose text is encoded for a non-Unicode font. I need to convert that file to Unicode. I searched for a converter, but most converters support only a fixed list of fonts, not arbitrary ones.

My font is very specific, and the text is in Devanagari script. I have both the file and the font. Please suggest a tool or technique. Thanks.

user625118

2 Answers

5

Unicode is not about fonts; it is about encoding. You need to find a converter that can convert your text to Unicode. What is the encoding of your text?
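If the bytes turn out to be in a known legacy encoding, the conversion itself is short. A minimal sketch in Python, assuming the source encoding is one that Python's codec registry supports; the function name `reencode_to_utf8` is made up for illustration:

```python
def reencode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8.

    source_encoding must be a codec name Python recognises (for example
    "latin-1" or "cp1252"). A LookupError is raised for an unknown codec,
    and a UnicodeDecodeError if the bytes are not valid in that encoding.
    """
    text = raw.decode(source_encoding)  # bytes -> Unicode string
    return text.encode("utf-8")         # Unicode string -> UTF-8 bytes


if __name__ == "__main__":
    legacy = "café".encode("latin-1")
    print(reencode_to_utf8(legacy, "latin-1"))
```

This only helps when the file uses a real character encoding. Text typed for a custom font that reuses ASCII or Latin code points for Devanagari glyphs will decode without error and still be wrong, because the bytes never carried Devanagari code points in the first place.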

Nemanja Trifunovic
1

Apache Tika has the ability to pull text from PDF files via knowledge of font behavior. So if the file is in fact a PDF you have a chance. If you have a text file full of font indices in no particular encoding, you have a big programming job ahead of you.
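For the plain-text case, the programming job usually comes down to building a table from the font's character codes to Unicode and then fixing character order. A rough sketch in Python, under loud assumptions: the table `HYPOTHETICAL_GLYPH_MAP` below is invented for illustration and is not the mapping of any particular font; the real one has to be built by inspecting the glyphs in the font you have. The reordering step relies on the fact that Unicode stores the vowel sign i (U+093F) after its consonant, while legacy fonts typically store it before, in visual order.

```python
# Hypothetical table: legacy character sequence -> Unicode Devanagari.
# Replace it with the real mapping derived from your font.
HYPOTHETICAL_GLYPH_MAP = {
    "d": "\u0915",  # KA
    "e": "\u092E",  # MA
    "y": "\u0932",  # LA
    "k": "\u093E",  # vowel sign AA
    "f": "\u093F",  # vowel sign I (stored before the consonant in legacy text)
}

I_MATRA = "\u093F"


def is_consonant(ch):
    # Basic Devanagari consonants occupy U+0915..U+0939.
    return "\u0915" <= ch <= "\u0939"


def fix_i_matra_order(text):
    """Move each vowel sign I after the single consonant that follows it.

    Simplification: conjuncts (consonant + virama + consonant) are not
    handled; a real converter must move the sign past the whole cluster.
    """
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] == I_MATRA and is_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2
        else:
            i += 1
    return "".join(chars)


def convert_legacy_text(text, glyph_map):
    """Map legacy sequences to Unicode, trying the longest keys first."""
    keys = sorted((k for k in glyph_map if k), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(glyph_map[key])
                i += len(key)
                break
        else:
            out.append(text[i])  # unmapped characters pass through unchanged
            i += 1
    return fix_i_matra_order("".join(out))


if __name__ == "__main__":
    print(convert_legacy_text("dey", HYPOTHETICAL_GLYPH_MAP))
```

Longest-match-first matters because many legacy fonts use multi-character sequences for a single letter or conjunct, so a shorter key must not consume part of a longer one. For an HTML file, the conversion would be applied only to the text nodes that use the legacy font, not to the markup.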

bmargulies