I would like to keep only the words that start with '@' and continue with letters or dots. So far I have done the opposite: I can match such words, but I don't know how to match everything except them, so that only the words starting with '@' are kept. This is the pattern I have so far:

(@[a-zA-Z0-9.]+\b)

I tried to use a negative lookahead ('?!'), but it doesn't work. Thanks!

John Snow
  • So `if` you have a match, `replace`; otherwise *don't do anything*. Leave the regex as is and do the "negation" in Python. – Adelin Jan 15 '18 at 14:05
  • How? I am using `str.replace` from pandas; where should I put the negation? Thanks – John Snow Jan 15 '18 at 14:07
  • Don't even match & negate & replace, instead match & join (the set of `@words`) – Aaron Jan 15 '18 at 14:07
  • Example : https://ideone.com/WqvV5H (note that I used `\w` for simplicity but it's not equivalent to your class. The trailing `\b` however can be safely removed, as well as the enclosing capturing group) – Aaron Jan 15 '18 at 14:13
  • Is there any way to do this negation in the regex itself or with pandas replace? I know about the `extractall` function, but its result is a MultiIndex frame, and `str.extract` extracts just the first occurrence – John Snow Jan 15 '18 at 14:20
  • That pattern matches anything containing an `@`, e.g. for `"a @b c@d"` it would match `@b` *and* `@d`. Is that what you want? – mata Jan 15 '18 at 14:21
  • @mata Well, you are right, this is not what I want. If I change it to `^(@[a-zA-Z0-9.]+\b)` then it catches just the first line. Any idea how to catch it anywhere? – John Snow Jan 15 '18 at 14:25
  • What about `(?:^|\s)(?:(?!@).)*`? Are the words defined as whatever sits between whitespace? – Nahuel Fouilleul Jan 15 '18 at 14:29
  • Something like `re.findall(r"(?:\A|\s)(@[a-zA-Z0-9.]+\b)", "@a @b @c@d e@f")` seems to do it, but also catches `@c` since `@` is a word boundary – mata Jan 15 '18 at 14:30
  • Thanks a lot! Nahuel's solution works best for me! – John Snow Jan 15 '18 at 14:37
  • @JohnSnow, it's not exactly the contrary: it will leave `@other_characters`, for example. It is the contrary of `@\S+` – Nahuel Fouilleul Jan 15 '18 at 14:45
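
The match-and-join idea from the comments could look roughly like the sketch below. It assumes words are delimited by whitespace, so the lookarounds `(?<!\S)` and `(?!\S)` replace the `\b` from the question; the function name `keep_mentions` is made up for illustration.

```python
import re

# '@' must sit at the start of the string or right after whitespace,
# and the word must be followed by whitespace or the end of the string.
# (Assumption: "words" are whitespace-delimited tokens.)
MENTION = re.compile(r"(?<!\S)@[a-zA-Z0-9.]+(?!\S)")


def keep_mentions(text: str) -> str:
    """Return only the @words of `text`, joined by single spaces."""
    return " ".join(MENTION.findall(text))


print(keep_mentions("hello @john.snow and c@d plus @arya"))
# -> @john.snow @arya
```

On a pandas column the same pattern should work with `Series.str.findall` followed by `Series.str.join(" ")`, which avoids the MultiIndex result of `extractall`.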

2 Answers

From the comments, the following regex works (it matches the text to remove):

(?:^|\s)[^@]*

The exact contrary of the pattern in the question would be:

(?:^|[^@A-Za-z0-9.]|@(?![A-Za-z0-9.]+\b))[^@]*
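
As a rough check of the second pattern, it can be passed to `re.sub` with an empty replacement so that everything it matches is dropped. The helper name `strip_non_mentions` is invented for this sketch.

```python
import re

# The "exact contrary" pattern from this answer: it matches the text
# that is NOT an @word, so substituting "" leaves only the @words.
NOT_MENTION = re.compile(r"(?:^|[^@A-Za-z0-9.]|@(?![A-Za-z0-9.]+\b))[^@]*")


def strip_non_mentions(text: str) -> str:
    """Delete everything the negated pattern matches."""
    return NOT_MENTION.sub("", text)


print(strip_non_mentions("hello @john and more"))
# -> @john
```

Note that the separating whitespace is matched and removed too, so `"@a @b"` comes back as `"@a@b"`; if the words need to stay separated, matching them and joining the results may be the simpler route.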
Nahuel Fouilleul

Try this regular expression:

(@+[a-zA-Z0-9.]+[a-zA-Z0-9]+)

I tested it online and it does what you are looking for: it matches every word that starts with @ and may continue with dots, e.g. @hello.sir matches, @hello.sir.do matches, and so on.
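
A quick way to try this pattern outside an online tester is `re.findall`. The sample string below is made up; it also shows two side effects worth knowing about: the pattern needs at least two characters after the `@`, so `@a` is skipped, and it has no anchor on the left, so it also picks up the tail of an email-like token.

```python
import re

# Pattern from this answer, unchanged. With a single capturing group,
# re.findall returns the text captured by that group.
PATTERN = re.compile(r"(@+[a-zA-Z0-9.]+[a-zA-Z0-9]+)")

sample = "@hello.sir @hello.sir.do @a mail@example.com"
matches = PATTERN.findall(sample)

print(matches)
# -> ['@hello.sir', '@hello.sir.do', '@example.com']
```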