
I'm trying to fill values into a formula string using Python template strings (`string.Template`). The formula sometimes contains identifiers with non-ASCII characters such as α, ß, Γ etc. (see the Unicode Greek and Coptic chart). According to the Python documentation, however, template strings only accept ASCII identifiers. The default regular expression used to match an identifier is `(?a:[_a-z][_a-z0-9]*)`.

How can I extend the default regular expression so that it also matches the characters from the Unicode Greek and Coptic chart?

  • Theoretically, you should be able to change `Template.idpattern` to accept non-ASCII identifiers. Practically, that does not seem to work. – DYZ Aug 30 '20 at 06:23
  • I checked it too. Probably because the pattern is already compiled when the class is created. I used the following to solve my problem: `from string import Template as _Template`, then `class Template(_Template):` with `idpattern = r'([_a-z\u00D8-\u00F6\u00F8-\u00FF\u0370-\u03FF][_a-z\u00D8-\u00F6\u00F8-\u00FF\u0370-\u03FF0-9]*)'` – Raza Cheema Aug 30 '20 at 13:37
  • But `\u00D8-\u00F6\u00F8-\u00FF` are not in the list you refer to in the question. Do you mean to match any Unicode letters? – Wiktor Stribiżew Aug 30 '20 at 13:44
  • @WiktorStribiżew The question is not about how to write a regex for identifiers, but how to make `Template` recognize it. – DYZ Aug 30 '20 at 15:57
  • @DYZ Thank you, sir, for your help. – Raza Cheema Sep 01 '20 at 11:23

1 Answer


This is how I was able to solve my problem.

    from string import Template as _Template


    class Template(_Template):
        """Custom template class; the default Template class doesn't support non-ASCII identifiers."""
        # Also accept characters from the Greek and Coptic block (U+0370 to U+03FF).
        idpattern = r'([_a-z\u0370-\u03FF][_a-z0-9\u0370-\u03FF]*)'

The subclass is required because Python compiles the regular expression when the class is created. The pattern built from the default `Template.idpattern` is therefore already fixed, and changing `idpattern` at a later stage has no effect.
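As a quick check of this behaviour, here is a minimal sketch, assuming a recent Python 3. The class names `GreekTemplate` and `LateTemplate` and the sample formula are made up for illustration; the second class shows that assigning `idpattern` after the class already exists is too late.

```python
from string import Template


class GreekTemplate(Template):
    """Hypothetical subclass: identifiers may also use Greek and Coptic characters."""
    idpattern = r'[_a-z\u0370-\u03FF][_a-z0-9\u0370-\u03FF]*'


class LateTemplate(Template):
    """Hypothetical subclass whose idpattern is changed only after creation."""


# Too late: LateTemplate.pattern was already compiled when the class was created.
LateTemplate.idpattern = GreekTemplate.idpattern

values = {'α': '2', 'β': '3'}
text = '$α * x + ${β}'

# The pattern compiled for GreekTemplate recognises both placeholders.
subclassed = GreekTemplate(text).substitute(values)

# safe_substitute leaves unrecognised placeholders untouched instead of raising.
late = LateTemplate(text).safe_substitute(values)

print(subclassed)  # 2 * x + 3
print(late)        # $α * x + ${β}
```

With `substitute` instead of `safe_substitute`, the `LateTemplate` call would raise a `ValueError` for the invalid placeholder, which is the same error the default `Template` gives for non-ASCII identifiers.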