
Say I have a pinyin:

gēge

How could I get the "tone number" of the accented character? E.g., in this case ē is first tone, so the ideal output would be ge1ge. But really, the first step is just: how would I convert the tone mark into a number?

Example input / output:

gēge
nǎinai
wàipó

BECOMES

ge1ge
na3inai
wa4ipo2

I would ideally like to do this in Python, but I'm flexible.

Thanks! :)

Wboy

2 Answers


When expressed in normal form D (*) (canonical decomposition), the four pinyin tones use the following Unicode combining marks:

  • COMBINING MACRON ('\u0304') for tone 1
  • COMBINING ACUTE ACCENT ('\u0301') for tone 2
  • COMBINING CARON ('\u030c') for tone 3
  • COMBINING GRAVE ACCENT ('\u0300') for tone 4
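To see the mapping above in practice, here is a minimal sketch that decomposes a precomposed vowel and reports the combining mark it contains. The helper name `decompose` is made up for illustration; only `unicodedata.normalize` and `unicodedata.name` come from the standard library.

```python
import unicodedata

def decompose(ch):
    """Split one accented vowel into (base letter, name of its combining mark).

    Assumes ch carries exactly one combining mark, as the vowels below do.
    """
    base, mark = unicodedata.normalize("NFD", ch)
    return base, unicodedata.name(mark)

# One vowel per tone: macron, acute, caron, grave.
for ch in "ēéěè":
    print(ch, decompose(ch))
```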

That means that automatic processing in Python is almost trivial: you normalize your (Unicode) string into normal form D and replace each of the above combining characters with its tone digit.

Code could be:

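import unicodedata

def to_tone_number(s):
    # Map each combining tone mark to the code point of its tone digit.
    table = {0x304: ord('1'), 0x301: ord('2'), 0x30c: ord('3'),
             0x300: ord('4')}
    # NFD splits accented vowels into base letter + mark; translate swaps the mark.
    return unicodedata.normalize('NFD', s).translate(table)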

You can then use:

>>> print(to_tone_number('''gēge
nǎinai
wàipó'''))
ge1ge
na3inai
wa4ipo2

in Python 3, or in Python 2:

>>> print(to_tone_number(u'''g\u0113ge
n\u01ceinai
w\xe0ip\xf3'''))
ge1ge
na3inai
wa4ipo2
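If only the tone number itself is needed (the "first step" in the question), the same decomposition idea can be sketched as a lookup. The name `tone_of` and the convention of returning 0 for an unmarked (neutral) syllable are assumptions for illustration, not part of the answer above.

```python
import unicodedata

# Combining marks for the four pinyin tones, as listed above.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}

def tone_of(syllable):
    """Return the tone number (1-4) of the first tone mark found, or 0 if none."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return 0  # assumption: no mark means neutral tone

print(tone_of("ē"), tone_of("pó"), tone_of("ge"))
```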

(*) Refs:

Serge Ballesta

Use regular expressions. The re module has a useful function:

re.findall()

You could use it first to identify all accented characters, and afterwards replace each one with the string replace method, for example:

str.replace('ē','e1')
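The find-and-replace idea does not require typing every accented vowel by hand. Below is a rough sketch that generates the lookup table from the six pinyin vowels and the four combining tone marks, then applies it with `re.sub`. The names `build_table` and `replace_tones` are hypothetical, and the sketch assumes each vowel-plus-mark pair has a precomposed Unicode form, which holds for a, e, i, o, u and ü.

```python
import re
import unicodedata

# Combining mark -> tone digit (macron, acute, caron, grave).
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}

def build_table(vowels="aeiouü"):
    """Generate {accented vowel: vowel + tone digit} for every vowel/tone pair."""
    table = {}
    for vowel in vowels:
        for mark, digit in TONE_MARKS.items():
            # NFC composes vowel + mark into a single precomposed character.
            accented = unicodedata.normalize("NFC", vowel + mark)
            table[accented] = vowel + digit
    return table

TABLE = build_table()
# Character class matching any one of the generated accented vowels.
PATTERN = re.compile("[" + "".join(TABLE) + "]")

def replace_tones(text):
    # Normalize to NFC first so decomposed input also matches the table.
    text = unicodedata.normalize("NFC", text)
    return PATTERN.sub(lambda m: TABLE[m.group()], text)

print(replace_tones("gēge nǎinai wàipó"))
```

Compared with translating the NFD form directly, this keeps ü as a single character in the output, at the cost of building the table up front.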

Gustav Rasmussen
  • I wish it was that simple. There are many combinations of accented characters, and I can't be typing every single toned character to replace it with the normal version of it, now can I? – Wboy Mar 17 '17 at 10:20
  • There are 21 initials and 6 finals. Each with 4 tones, that's 108 manual encodings @abccd – Wboy Mar 17 '17 at 10:50
  • There's no better way; you can download a module off PyPI, but that's what they do as well – Taku Mar 17 '17 at 12:30