
Lots of ready-to-use character classes are available in Perl regular expressions, such as \d or \S, or new-fangled Unicode grokkers such as \p{P}, which matches punctuation characters.

Now let's say I'd like to match all punctuation characters \p{P} (there are quite a number of them, and not something you want to type in by hand), all but one: the good old comma (,).

Is there a way to specify this requirement short of expanding the handy character class and taking away the comma by hand?

Lumi
  • Found a very similar question, well, basically the same question: [How to match any non white space character except a particular one in Perl?](http://stackoverflow.com/a/6125137/269126) – Lumi Dec 14 '11 at 13:00

2 Answers

$ unichars -au '\p{P}' | wc -l
598

Double negation:

/[^\P{P},]/

$ unichars -au '[^\P{P},]' | wc -l
597

"And" through lookahead/lookbehind:

/\p{P}(?<!,)/

$ unichars -au '\p{P}(?<!,)' | wc -l
597
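
If unichars is not installed, the same comparison can be sketched in plain Perl. This is a minimal sketch, not the unichars implementation: it walks every code point (skipping surrogates, which is an assumption made here to keep the loop simple) and counts matches for each of the three patterns. The hash keys are illustrative names, and the exact totals depend on the Unicode version bundled with your perl, so they may differ from the numbers above.

```perl
use strict;
use warnings;

# Tally of code points matched by each pattern (key names are illustrative).
my %count = (all => 0, negated_class => 0, lookbehind => 0);

for my $cp (0 .. 0x10FFFF) {
    next if $cp >= 0xD800 && $cp <= 0xDFFF;    # skip surrogates
    my $ch = chr $cp;
    $count{all}++           if $ch =~ /\A\p{P}\z/;
    $count{negated_class}++ if $ch =~ /\A[^\P{P},]\z/;
    $count{lookbehind}++    if $ch =~ /\A\p{P}(?<!,)\z/;
}

printf "%-14s %d\n", $_, $count{$_} for sort keys %count;
```

Whatever the totals are on a given perl, the two comma-excluding patterns should agree with each other and come out exactly one lower than plain \p{P}.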

unichars is part of the Unicode::Tussle distribution on CPAN.

ikegami

Try this:

[^\P{P},]

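This is a negated character class, which matches every character except those listed.

\P{P} is the negation of \p{P}, so the class excludes every non-punctuation character as well as the comma. What remains is all punctuation except the comma.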
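
As a quick illustration of the class in use, here is a small sketch that strips all punctuation except commas from a string. The helper name strip_punct_except_comma and the sample text are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every punctuation character other than the comma.
sub strip_punct_except_comma {
    my ($text) = @_;
    (my $copy = $text) =~ s/[^\P{P},]//g;    # operate on a copy, leave the input intact
    return $copy;
}

print strip_punct_except_comma("Hello, world! (Really?) Yes, really."), "\n";
# prints: Hello, world Really Yes, really
```

The commas survive because the comma is listed inside the negated class alongside \P{P}, so it is excluded from the match just like every non-punctuation character.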

stema