-4

I have a string, for example "Free Visual Studio developer offers offers offers offers with everything you need to create apps". Here "offers" occurs 4 times in a row; I want to keep it just once and remove all the other occurrences. This is just an example string; I have a dataset with many cases where the same word occurs consecutively more than once. Please help me find a way to remove the repeated words, keep one copy, and generate the resulting string.

Martijn Pieters
Shubham Ringne
  • a) split your string on whitespace to produce a list of words. b) Apply the solution in the duplicate question. c) re-join the resulting list with a space again. – Martijn Pieters Aug 22 '16 at 08:43
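
The three steps in the comment above can be sketched as follows. The solution from the linked duplicate is not shown here, so this assumes `itertools.groupby` for collapsing consecutive equal items; the function name `collapse_repeats` is hypothetical.

```python
from itertools import groupby


def collapse_repeats(text):
    """Keep one copy of each run of identical consecutive words."""
    # a) Split on whitespace to get a list of words.
    words = text.split()
    # b) groupby() groups consecutive equal items; keep only each group's key.
    #    (Assumption: this stands in for the duplicate question's solution.)
    deduped = [word for word, _group in groupby(words)]
    # c) Re-join the remaining words with a single space.
    return " ".join(deduped)
```

Note that splitting and re-joining also normalizes any run of whitespace to a single space, which may or may not be wanted for a given dataset.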

1 Answer

0

You can use regular expressions to find repeating words and remove the extra ones:

>>> import re
>>> s = 'Free Visual Studio developer offers offers offers offers with everything you need to create apps'
>>> re.sub(r'\b(\w+\s)(\1+)', '\\1', s)
'Free Visual Studio developer offers with everything you need to create apps'
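
One caveat: because the pattern above captures the trailing whitespace together with the word, a repeated word at the very end of the string (with no whitespace after it) is not collapsed. A possible variant, offered as a sketch and not as part of the original answer, captures only the word and requires a word boundary after each repeat; the name `dedupe_adjacent` is hypothetical.

```python
import re

# (\w+)         capture one word
# (?:\s+\1\b)+  followed by one or more whitespace-separated repeats of
#               that same word, each ending at a word boundary
_REPEAT = re.compile(r'\b(\w+)(?:\s+\1\b)+')


def dedupe_adjacent(text):
    """Replace each run of a repeated word with a single copy."""
    return _REPEAT.sub(r'\1', text)
```

With this variant, `dedupe_adjacent('offers offers')` returns `'offers'`, while a string such as `'is is island'` becomes `'is island'` because the boundary check stops the match before `island`. Matching is case-sensitive, and repeats separated by punctuation are left alone.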
poke