I am trying to match a single character (specifically an abbreviation for clothing size) in a Python string which can take several forms. Here are some examples of the data that I am working with:
'Youth Demo Short Sunset Orange S(10)'
'Youth Demo Short Sunset Orange S-10'
'Youth Demo Short Sunset Orange S/10'
'Youth Demo Short Sunset Orange S'
'S-Works Shiv S'
I would like to build a regex pattern that looks for either an isolated size or a size followed by a non-word character and a digit, and returns the size. In the above data, matching the fourth example is easy with \b(S)\b, but that alone is not enough to keep the regex from matching the S in S-Works in the fifth string. I was thinking of something like the following for the first three strings: \bS(?=\W\d), but that will miss the last two strings.
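To make the problem reproducible, here is a minimal sketch that runs both attempted patterns over the five sample strings with Python's re module. The helper name match_positions is hypothetical and only used for this illustration; it reports the start index of every match so the unwanted hit on S-Works is visible.

```python
import re

samples = [
    'Youth Demo Short Sunset Orange S(10)',
    'Youth Demo Short Sunset Orange S-10',
    'Youth Demo Short Sunset Orange S/10',
    'Youth Demo Short Sunset Orange S',
    'S-Works Shiv S',
]

# Attempt 1: an isolated S between word boundaries.
isolated = re.compile(r'\b(S)\b')
# Attempt 2: an S followed by a non-word character and a digit.
before_digit = re.compile(r'\bS(?=\W\d)')


def match_positions(pattern, text):
    """Hypothetical helper: start index of every match of pattern in text."""
    return [m.start() for m in pattern.finditer(text)]


isolated_hits = [match_positions(isolated, s) for s in samples]
before_digit_hits = [match_positions(before_digit, s) for s in samples]

for text, a, b in zip(samples, isolated_hits, before_digit_hits):
    print(f'{text!r}: isolated={a} before_digit={b}')
```

With the first pattern the fifth string yields two matches (the S in S-Works at index 0 and the trailing size), while the second pattern finds nothing in the fourth and fifth strings.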