Why does the pattern [A-Z][A-z]* return Ve for the French word Vénus?
I am using NSRegularExpression. I want to match camel-case words, but this word behaves strangely.
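Your regex matches Ve and not Vé because there are two ways to represent an é in Unicode: as the single precomposed code point U+00E9, or as a plain e followed by the combining acute accent (U+0065 U+0301). Note that the combining mark is not the "standalone" ´ character (U+00B4).

Your string is apparently encoded using the second form. Therefore [A-z] matches only the first half of the combined character, the plain e. Since the following combining accent doesn't match, the regex stops at this point. You should normalize the string before applying a regex to it. Keep in mind that the precomposed é (U+00E9) is still outside the ASCII range, so an ASCII-only class will not match it either; to capture the whole word, the pattern also has to accept non-ASCII letters, for example with \p{L}.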
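
To make the normalization step concrete, here is a minimal Swift sketch. It assumes the input really is in decomposed form (e followed by U+0301); the helper firstMatch(of:in:) and the Unicode-aware pattern \p{Lu}\p{L}* are illustrative choices of mine, not something from the question. It uses precomposedStringWithCanonicalMapping to convert the string to its precomposed (NFC) form.

```swift
import Foundation

// Hypothetical helper: returns the first match of `pattern` in `text`, or nil.
func firstMatch(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let nsText = text as NSString
    let whole = NSRange(location: 0, length: nsText.length)
    guard let match = regex.firstMatch(in: text, options: [], range: whole) else { return nil }
    // Slice via NSString so the UTF-16 range from the regex is used as-is.
    return nsText.substring(with: match.range)
}

// "Vénus" written as V, e, U+0301 (combining acute accent), n, u, s.
let decomposed = "Ve\u{0301}nus"
// Canonical composition (NFC): e + U+0301 becomes the single code point U+00E9.
let normalized = decomposed.precomposedStringWithCanonicalMapping

// The combining accent is not in the class, so the match stops after "e".
let before = firstMatch(of: "[A-Z][A-Za-z]*", in: decomposed)        // "Ve"
// After normalization, é is one code point but still not an ASCII letter.
let asciiOnly = firstMatch(of: "[A-Z][A-Za-z]*", in: normalized)     // "V"
// Assumed alternative: Unicode letter properties cover é as well.
let unicodeAware = firstMatch(of: "\\p{Lu}\\p{L}*", in: normalized)  // "Vénus"
```

Normalizing alone changes where the ASCII-only pattern stops, but does not make it match the accented letter; the letter-property pattern is what picks up the full word here.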
Furthermore, use [A-Za-z] instead of [A-z]. The range A-z also covers the ASCII characters that sit between Z and a, so non-letter characters like ^ or ] will be matched as well.
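
A quick way to see the difference, sketched in Swift using String's range(of:options:) with the .regularExpression option. The list of test characters is my own choice: the six ASCII characters between Z (0x5A) and a (0x61).

```swift
import Foundation

// The six ASCII characters located between "Z" and "a".
let punctuation = ["[", "\\", "]", "^", "_", "`"]

// [A-z] spans the whole ASCII block from A to z, so all six are accepted.
let sloppy = punctuation.filter {
    $0.range(of: "[A-z]", options: .regularExpression) != nil
}

// [A-Za-z] lists the two letter ranges separately, so none are accepted.
let strict = punctuation.filter {
    $0.range(of: "[A-Za-z]", options: .regularExpression) != nil
}

print("matched by [A-z]:", sloppy.count)     // 6
print("matched by [A-Za-z]:", strict.count)  // 0
```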