1

I have a string: prawy p pęknięty p zderzak pęknięcie (it's in Polish).

I want to select every p, except the "p" in the words "pęknięty" and "peknięcie".

I tried something like this: \b(s*ps*)\b, but it doesn't work properly. Any ideas?

Wiola

2 Answers

0

Maybe,

\bp(?=[a-z]+|\s|$)

or

(?!pęknięcie|pęknięty)\bp

might simply work fine.
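
As a quick check, here is a minimal Python sketch that runs both expressions against the string from the question. The variable names are only illustrative. Note that the first expression relies on ę falling outside [a-z], so it would also skip a p followed by any other non-ASCII letter, and it would not skip the ASCII spelling peknięcie.

```python
import re

text = "prawy p pęknięty p zderzak pęknięcie"

# Positive lookahead: p must be followed by ASCII letters, whitespace, or the end.
lookahead = re.compile(r"\bp(?=[a-z]+|\s|$)")
# Negative lookahead placed before p: reject positions where either word starts.
negative = re.compile(r"(?!pęknięcie|pęknięty)\bp")

starts_lookahead = [m.start() for m in lookahead.finditer(text)]
starts_negative = [m.start() for m in negative.finditer(text)]

print(starts_lookahead)  # [0, 6, 17]
print(starts_negative)   # [0, 6, 17]
```

Both report the p in prawy and the two standalone p tokens, and skip the p in pęknięty and pęknięcie.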

Demo 1

Demo 2


If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com. There you can also see how it matches against some sample inputs.


Emma
0

You might use a negative lookahead and a character class:

\bp(?![eę]knię(?:cie|ty)\b)

In parts

  • \bp Match p preceded by a word boundary
  • (?! Negative lookahead: assert that what is directly to the right is not
    • [eę]knię Match e or ę followed by knię
    • (?:cie|ty)\b Match cie or ty and a word boundary
  • ) Close negative lookahead
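
To see which p characters the lookahead described above keeps, here is a small Python sketch that wraps every match in brackets. The bracket marking is just an illustrative choice, not part of the answer.

```python
import re

text = "prawy p pęknięty p zderzak pęknięcie"

# p at a word boundary, unless it starts pęknięty/pęknięcie (with e or ę).
pattern = re.compile(r"\bp(?![eę]knię(?:cie|ty)\b)")

# Replace each matched p with [p] so the selected positions are visible.
marked = pattern.sub("[p]", text)
print(marked)  # [p]rawy [p] pęknięty [p] zderzak pęknięcie
```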

Regex demo

Note that the character class [eę] accepts either letter in that position, so a variant spelling such as peknięty is also treated as one of the excluded words.

To match the words exactly, you could spell them out and end each one with a word boundary:

\bp(?!ęknięty\b|ęknięcie\b)
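
The difference between the two variants only shows up on a spelling with a plain e. A minimal Python sketch, using a made-up sample string that contains both spellings:

```python
import re

# Hypothetical sample: a lone p, the ASCII-e spelling, and the correct spelling.
sample = "p peknięcie pęknięcie"

loose = re.compile(r"\bp(?![eę]knię(?:cie|ty)\b)")  # character class version
exact = re.compile(r"\bp(?!ęknięty\b|ęknięcie\b)")  # exact words only

loose_hits = [m.start() for m in loose.finditer(sample)]
exact_hits = [m.start() for m in exact.finditer(sample)]

print(loose_hits)  # [0]
print(exact_hits)  # [0, 2]
```

The exact version also selects the p in peknięcie, because that spelling is not one of the two listed words. Which behaviour is right depends on whether the plain-e spelling can occur in your data.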

Regex demo

The fourth bird