I'm looking for a regex that matches a lexical unit written in numbered pinyin, i.e. one or more pinyin syllables, each followed by its tone number, with no spaces in between.
The question "Regex for Matching Pinyin" was a good starting point, as I was able to quickly add support for the tone number with:
/(ORIGINAL_REGEXP)[0-5]/
So essentially I wrap the old regexp in a group and append a character class for the tone number. However, I haven't been able to extend this to units made of several syllables. For instance:
jiao4zuo4zhi1wu4 叫座之物
jiao4zu3 教祖
jiao4zong1xuan3ju3 教宗选举
jiao4zi3 教子
jiao4zhun3yi2qi4 校准仪器
jiao4zhun3tiao2 校准条
jiao4zhun3ti1chi3 校准梯尺
jiao4zhun3quan1 校准圈
jiao4zhun3qi4 校准器
jiao4zhun3pu3 校准谱
N.B.: This expression will be used in a JavaScript context.
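
For reference, the direction I have been exploring is to treat "syllable + tone number" as one group and repeat that group between anchors. The sketch below is only an illustration under assumptions: `SYLLABLE` is a deliberately loose stand-in for ORIGINAL_REGEXP (it accepts any run of letters, so it does not validate real pinyin syllables), and the names `isNumberedPinyinWord` and `splitSyllables` are hypothetical helpers, not part of any library.

```javascript
// Loose stand-in for ORIGINAL_REGEXP (assumption): any run of letters, including ü.
// Swap in the full pinyin syllable pattern for strict validation.
const SYLLABLE = "[a-zü]+";

// One numbered syllable: the syllable followed by a tone number,
// exactly as in the single-syllable version /(ORIGINAL_REGEXP)[0-5]/.
const NUMBERED_SYLLABLE = `(?:${SYLLABLE})[0-5]`;

// A lexical unit: one or more numbered syllables and nothing else.
// The ^ and $ anchors reject spaces and trailing text.
const NUMBERED_PINYIN_WORD = new RegExp(`^(?:${NUMBERED_SYLLABLE})+$`, "i");

// Returns true when the whole string is a run of numbered syllables.
function isNumberedPinyinWord(text) {
  return NUMBERED_PINYIN_WORD.test(text);
}

// Splits a unit into its numbered syllables using a global,
// unanchored version of the same group.
function splitSyllables(text) {
  return text.match(new RegExp(NUMBERED_SYLLABLE, "gi")) || [];
}
```

With this stand-in pattern, `isNumberedPinyinWord("jiao4zuo4zhi1wu4")` accepts the unit, while `"jiao4 zu3"` (with a space) and `"jiao"` (no tone number) are rejected. Because every syllable ends in a digit, the digits act as delimiters, which is why the loose letter class is enough to split the examples; a strict syllable pattern would only be needed to reject non-pinyin letter sequences.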