Your punctuation problems are: (1) you need to escape the . with a backslash when you want to match an actual decimal point, because otherwise it matches any character; and (2) you need to escape the double-quote, or otherwise prevent it from terminating your string.
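To make the first point concrete, here is a minimal sketch comparing an unescaped and an escaped dot. The names loose and strict and the sample strings are made up for illustration only:

```python
import re

# Unescaped '.' matches any character, so "12x5" is accepted as well.
loose = re.compile(r'\d+.\d+')

# Escaped '\.' matches only a literal decimal point.
strict = re.compile(r'\d+\.\d+')

print(bool(loose.fullmatch("12x5")))   # True
print(bool(strict.fullmatch("12x5")))  # False
print(bool(strict.fullmatch("12.5")))  # True

# For the second point: a double-quote inside a double-quoted string must be
# escaped, or you can delimit the string with single quotes instead.
escaped_quote = re.compile("\\d+\"")   # ordinary string, backslashes doubled
single_quoted = re.compile(r'\d+"')    # raw string delimited by single quotes
print(escaped_quote.pattern == single_quoted.pattern)  # True
```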
The best way to write this as a readable, debuggable regex is to use a Python "raw" string, r"like this", which allows backslashes without escaping, and furthermore to triple-quote it, which lets you use both ' and " inside it without escaping. And since triple-quoted strings can span multiple lines, you could even compile in VERBOSE mode, which allows whitespace and comments inside the pattern. Debuggability of your subsequent matching/extraction code is also improved if you use the (?P<...>) named-group syntax in your regex: groups will then be accessible by meaningful names in the match object's groupdict() output. Taken all together, that gives us:
import re

PATTERNS = [  # a list of alternative acceptable formats
    re.compile(r"""
        ^\s*                        # beginning of string (optional whitespace)
        (?P<degrees>\d+)[\s]        # integer number of degrees, then one whitespace character (NB: might be desirable to insert the degree symbol into the square brackets here, to allow that as a possibility?)
        (?P<minutes>\d+)'           # integer number of minutes
        (?P<seconds>\d+(\.\d*)?)"   # seconds, with optional decimal point and decimal places
        (?P<axis>[NE]?)             # optional 'N' or 'E' character (remove '?' to make it compulsory)
        \s*$                        # end of string (optional whitespace)
    """, re.VERBOSE),
    re.compile(r"""
        ^\s*                        # beginning of string (optional whitespace)
        (?P<degrees>\d+)[\s]        # integer number of degrees, then one whitespace character (NB: might be desirable to insert the degree symbol into the square brackets here, to allow that as a possibility?)
        (?P<minutes>\d+(\.\d*)?)    # minutes, with optional decimal point and decimal places
        (?P<axis>[NE]?)             # optional 'N' or 'E' character (remove this line if this is never appropriate in this format)
        \s*$                        # end of string (optional whitespace)
    """, re.VERBOSE),
]
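
One way to use a list of patterns like this is to try each one in turn and read groupdict() from the first match. The following is only a sketch: parse_coordinate is a hypothetical helper name, the sample input is made up, and DMS is a compact copy of the first format so that the snippet runs on its own.

```python
import re

# Compact copy of the degrees/minutes/seconds format, repeated here only so
# that this sketch is self-contained.
DMS = re.compile(r"""
    ^\s*
    (?P<degrees>\d+)[\s]        # integer degrees, then one whitespace character
    (?P<minutes>\d+)'           # integer minutes
    (?P<seconds>\d+(\.\d*)?)"   # seconds, optionally with decimal places
    (?P<axis>[NE]?)             # optional 'N' or 'E'
    \s*$
""", re.VERBOSE)


def parse_coordinate(text, patterns):
    """Return the named groups of the first matching pattern, or None."""
    for pattern in patterns:
        match = pattern.match(text)
        if match:
            return match.groupdict()
    return None


fields = parse_coordinate('51 28\'40.12"N', [DMS])
print(fields)
print(parse_coordinate("not a coordinate", [DMS]))
```

The values in the returned dictionary are strings, so convert them with int() or float() before doing arithmetic. Only the named groups appear in groupdict(); the unnamed group around the decimal part is left out.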