Given a character (one letter of a string), how can I identify which language it belongs to? The options are English, Russian, and Hebrew.

Background: the character was entered by a user in a form and then stored in a database.

It can be, for example, the first letter of one of these words:

  • Hello
  • Привет
  • שלום
Misha Moroshko

1 Answer

The Unicode standard is divided into "blocks". Look here:

http://www.unicode.org/charts/

http://en.wikipedia.org/wiki/Unicode_block

http://www.unicode.org/versions/Unicode6.0.0/

and find the Unicode blocks (code point intervals) for the script each language is written in.

My guess:

So for you it's just a matter of a simple numeric comparison on each character's Unicode ordinal value (code point): check which block's interval it falls into. Very simple.
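
A rough sketch of that comparison in Python. The language choice, the `guess_language` name and the `RANGES` table are my own assumptions for illustration. The intervals are the Basic Latin letters, the Cyrillic block (U+0400 to U+04FF) and the Hebrew block (U+0590 to U+05FF), so check them against the charts above before relying on them:

```python
# Assumed code point intervals, read off the Unicode block charts.
RANGES = {
    "English": [(0x0041, 0x005A), (0x0061, 0x007A)],  # A-Z and a-z in Basic Latin
    "Russian": [(0x0400, 0x04FF)],                    # Cyrillic block
    "Hebrew": [(0x0590, 0x05FF)],                     # Hebrew block
}


def guess_language(ch):
    """Return the language whose interval contains ch, or None if none does."""
    code = ord(ch)  # Unicode ordinal value of the character
    for language, intervals in RANGES.items():
        for low, high in intervals:
            if low <= code <= high:
                return language
    return None


for word in ("Hello", "Привет", "שלום"):
    print(word, guess_language(word[0]))
```

Keep in mind that a block identifies a script rather than a language: the Cyrillic block is also used for Ukrainian, Bulgarian and others. With only these three options that should not matter. Digits, punctuation and anything else outside the three intervals return `None`.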

lzap