I have a list whose strings contain English letters, Hindi letters, Greek symbols, and digits. I want to remove everything except the Hindi characters. Hindi (Devanagari) characters occupy the Unicode range U+0900 to U+097F. For details about the Hindi characters, see http://jrgraphix.net/r/Unicode/0900-097F.

Input:

l=['ग','1ए','==क','@','ऊं','abc123','η','θ','abcशि']

for i in l:
    print i

Desired Output:

ग
ए
क
ऊं
शि
Aran-Fey
Ishpreet

1 Answer

To get a character's code point, you can use the built-in ord(char) function.

In your case, something like this should work:

strings = [u'ग',u'1ए',u'==क',u'@',u'ऊं',u'abc123',u'η',u'θ',u'abcशि']
for string in strings:
    for char in string:
        if ord(u'\u0900') <= ord(char) <= ord(u'\u097F'):
            print(char)

The ord(char) function is available in both Python 2 and Python 3.

https://docs.python.org/2.7/howto/unicode.html
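Note that printing one code point at a time puts combining signs (such as the vowel sign in शि) on their own lines, while the desired output keeps each item together. A minimal sketch of one way to handle that, assuming the goal is one filtered string per list item with empty results dropped; the name `keep_devanagari` and the two range constants are illustrative, not part of the original answer:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch: keep only characters in the Devanagari block
# (U+0900 to U+097F) and join them back together per input string.
DEVANAGARI_START = u'\u0900'
DEVANAGARI_END = u'\u097F'


def keep_devanagari(strings):
    """Return each string reduced to its Devanagari characters, skipping empty results."""
    result = []
    for string in strings:
        kept = u''.join(
            ch for ch in string
            if DEVANAGARI_START <= ch <= DEVANAGARI_END
        )
        if kept:  # drop items such as u'@' or u'abc123' that had no Hindi characters
            result.append(kept)
    return result


strings = [u'ग', u'1ए', u'==क', u'@', u'ऊं', u'abc123', u'η', u'θ', u'abcशि']
filtered = keep_devanagari(strings)
for item in filtered:
    print(item)
```

Comparing the characters directly works because single-character unicode strings compare by code point, so the explicit ord() calls are optional here.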

Ceppo93
  • @ishpreet I've tested with Python 2.7. What do you mean by _'It's not working'_? Is the output wrong? Did you get an error? – Ceppo93 Jul 19 '16 at 10:27
  • According to Google, ord() "returns a string of one character whose ASCII code is the integer", while I want the unicode character, given that I have a unicode value. Also, your code is not printing unicode characters like 'ग' even though they are in the unicode range. – Ishpreet Jul 19 '16 at 10:42
  • My bad, I forgot to update the answer. You must pass an explicit unicode string, like `u'ग'`, or it doesn't work. – Ceppo93 Jul 19 '16 at 10:57
  • Not all of them, but that's because of some encoding issues; even my editor encodes some of them badly (e.g. the last one). – Ceppo93 Jul 19 '16 at 11:33
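
The point about explicit unicode strings can be illustrated with a short sketch. In Python 2, a plain `'ग'` literal in a UTF-8 source file is a byte string, so iterating over it yields individual bytes whose values never fall in the 0x0900 to 0x097F range. The variable names below are illustrative only; the snippet encodes and decodes explicitly so that it behaves the same way in Python 2 and Python 3:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch: the same character as UTF-8 bytes and as decoded text.
raw = u'ग'.encode('utf-8')    # byte string: what a non-unicode literal holds in Python 2
text = raw.decode('utf-8')    # unicode text: one code point, U+0917

byte_count = len(raw)         # UTF-8 uses three bytes for this character
char_count = len(text)        # a single character once decoded
is_devanagari = 0x0900 <= ord(text) <= 0x097F

print(byte_count, char_count, is_devanagari)
```

The range check is only meaningful on the decoded text, which is why the list in the answer uses `u'...'` literals.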