
If I do the following:

ustr = unicode()
ustr = '青皮'    
print len(ustr)

I get an output of 6. But that's the number of bytes.

How do I get an output of 2 (i.e. the actual number of Unicode code points)?

patchwork
  • @Bhargav Rao - I don't think either of the 2 answers to that question actually answers mine. – patchwork Jun 07 '16 at 15:27
  • The dupe perfectly answers your post. Try `len(__import__('unicodedata').normalize('NFC',u'青皮'))`. – Bhargav Rao Jun 07 '16 at 15:42
  • The answer seems to be to create the unicode object as `ustr = unicode('青皮', 'utf-8')`. Then `len(ustr)` gives me the number of code points. – patchwork Jun 07 '16 at 15:53
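
For anyone landing here, the fix from the last comment can be sketched as below. This is only an illustration, not the asker's exact code: it assumes the bytes are UTF-8, and the names `text`, `raw`, `byte_count` and `code_point_count` are made up for the example. The string is written with `\u` escapes so the snippet does not depend on the source file encoding, and it should behave the same on Python 2 and Python 3.

```python
# The two code points of the string from the question, written as escapes.
text = u'\u9752\u76ae'

# Encoding to UTF-8 gives the kind of byte string that a plain '...'
# literal holds in a UTF-8 encoded Python 2 source file.
raw = text.encode('utf-8')

# len() on the byte string counts bytes (3 per character here).
byte_count = len(raw)

# Decoding back to a unicode string makes len() count code points.
code_point_count = len(raw.decode('utf-8'))

print(byte_count)        # 6
print(code_point_count)  # 2
```

In Python 2 the same effect comes from writing the literal as `u'...'` (with a source encoding declaration) or from calling `unicode(raw, 'utf-8')`, which is what the comment above suggests.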
