
The following code returns regex patterns matching English characters.

To match only lowercase characters:

LocaleData.getExemplarSet(ULocale.forLanguageTag("en-US"), 0).toPattern(true)
LocaleData.getExemplarSet(ULocale.forLanguageTag("en-US"), 0).toPattern(false)

To match both uppercase and lowercase characters:

LocaleData.getExemplarSet(ULocale.forLanguageTag("en-US"), UnicodeSet.CASE).toPattern(true)
LocaleData.getExemplarSet(ULocale.forLanguageTag("en-US"), UnicodeSet.CASE).toPattern(false)

And here's the output of all four calls, in order:

[a-z]
[a-z]
[A-Za-zſK]
[A-Za-z\u017F\u212A]

Why does the latter (case-insensitive) pattern also include ſ (long s, U+017F) and K (Kelvin sign, U+212A)?
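For what it's worth, a quick check with plain JDK calls (no ICU4J involved) shows how these two characters case-map. The class name CaseClosureProbe and its helper methods are made up for this sketch, and whether UnicodeSet.CASE relies on the same mappings is only an assumption here, not something confirmed:

```java
public class CaseClosureProbe {
    // Simple (single code point) lowercase mapping from the JDK's Unicode tables.
    static int lower(int codePoint) {
        return Character.toLowerCase(codePoint);
    }

    // Simple (single code point) uppercase mapping from the JDK's Unicode tables.
    static int upper(int codePoint) {
        return Character.toUpperCase(codePoint);
    }

    public static void main(String[] args) {
        int longS = 0x017F;   // ſ, LATIN SMALL LETTER LONG S
        int kelvin = 0x212A;  // KELVIN SIGN

        // The Kelvin sign lowercases to an ordinary ASCII 'k' (U+006B).
        System.out.printf("U+%04X lower -> U+%04X%n", kelvin, lower(kelvin));
        // The long s uppercases to an ordinary ASCII 'S' (U+0053).
        System.out.printf("U+%04X upper -> U+%04X%n", longS, upper(longS));
    }
}
```

So both characters share a case mapping with a letter in [A-Za-z], which might be why a case closure over [a-z] picks them up.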

dzieciou
