I want to use this encoding for Tamil text because it is more consistent with the language's nature, whereas the Unicode encoding severely damages (read more here) the intrinsic features of the fusion of alphabets.
I want to run regular expressions over text in this encoding. Is that possible with Python's regex module, or would I have to write my own FSM for it?
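
To make the question concrete, here is a minimal sketch of what I have in mind. It assumes that each syllable in the encoding is stored as a single code point and that the syllables of one consonant sit in a contiguous block. The code point values and the names `KA_ROW_START`, `ROW_SIZE` and `row_class` are hypothetical placeholders of mine, not the real code table.

```python
import re

# Hypothetical placeholder values: NOT the real code table of the encoding.
# Assumption: one code point per syllable, with each consonant's syllables
# laid out as a contiguous block of 16 code points.
KA_ROW_START = 0xE200  # hypothetical start of one consonant row
ROW_SIZE = 16          # hypothetical number of syllables per row


def row_class(start, size=ROW_SIZE):
    """Build a regex character class covering one consonant row."""
    end = start + size - 1
    # \uXXXX escapes are understood by Python's re module in str patterns.
    return "[\\u{:04x}-\\u{:04x}]".format(start, end)


# One or more consecutive syllables from the hypothetical row.
row_run = re.compile(row_class(KA_ROW_START) + "+")

# Two syllables from the row, one from outside it, then one more from the row.
sample = chr(0xE201) + chr(0xE20F) + chr(0xE300) + chr(0xE205)
matches = [m.group(0) for m in row_run.finditer(sample)]
print([[hex(ord(c)) for c in m] for m in matches])
```

If the text can be held in a Python `str` like this, the pattern operates on whole code points, so a range inside a character class would select a whole consonant row. What I am unsure about is whether this holds for the actual encoding or whether a hand-written FSM is needed.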