
I have a string like the one below:

tweet = "thank you guys, for coming my birthday @abcd @defg @hijk , and  @abcd don't forget your promises"

How can I change that tweet so that it becomes the following?

tweet = "thank you guys, for coming my birthday USERNAME_TWITTER_1 USERNAME_TWITTER_2 USERNAME_TWITTER_3 , and USERNAME_TWITTER_1 don't forget your promises"


PWS

1 Answer


You can use an id_dispatcher function:

from itertools import count

def id_dispatcher():
    return lambda c=count(1): next(c)

Then we can set up a defaultdict from the collections module:

from collections import defaultdict

dc = defaultdict(id_dispatcher())

and then use a regex replacement (see link for the construction of a Twitter username regex):

import re

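# Match "@name" at the start of the string, or after a character other than
# a letter, digit, '-', '_' or '.'.
re_user = re.compile(r'(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9]+)')
# dc hands out the next free number the first time a username is looked up.
outp = re_user.sub(lambda x : 'USERNAME_TWITTER_%s'%dc[x.group(0)],tweet)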

This produces:

>>> re_user.sub(lambda x : 'USERNAME_TWITTER_%s'%dc[x.group(0)],tweet)
"thank you guys, for coming my birthday USERNAME_TWITTER_1 USERNAME_TWITTER_2 USERNAME_TWITTER_3 , and  USERNAME_TWITTER_1 don't forget your promises"
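
Putting the pieces together, here is a minimal sketch of the same idea wrapped in one function, so that numbering restarts for every tweet instead of living in a module-level dictionary. The function name anonymize_mentions and the constant USER_RE are hypothetical, and the pattern is an assumed simplification of the regex above (a negative lookbehind instead of the nested positive one), not the answer's exact expression.

```python
import re
from collections import defaultdict
from itertools import count

# Assumed simplification of the pattern above: reject an "@" that directly
# follows a letter, digit, "_", "." or "-" (e.g. inside an e-mail address).
USER_RE = re.compile(r'(?<![A-Za-z0-9_.-])@([A-Za-z]+[A-Za-z0-9]+)')


def anonymize_mentions(text):
    """Replace each distinct @mention with USERNAME_TWITTER_<n> (hypothetical helper)."""
    counter = count(1)                        # fresh numbering for every call
    ids = defaultdict(lambda: next(counter))  # first lookup of a name assigns the next id
    return USER_RE.sub(lambda m: 'USERNAME_TWITTER_%s' % ids[m.group(0)], text)


tweet = ("thank you guys, for coming my birthday @abcd @defg @hijk , "
         "and  @abcd don't forget your promises")
print(anonymize_mentions(tweet))
```

Because the counter is created inside the function, a second tweet starts again at USERNAME_TWITTER_1; keep the dictionary outside the function if ids should stay stable across tweets.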
Willem Van Onsem
  • You can make your `id_dispatcher` - `from itertools import count; from collections import defaultdict; dc = defaultdict(lambda c=count(): next(c))`... – Jon Clements Jun 15 '17 at 10:56
  • Willem - still slightly over complicating that `id_dispatcher` - at this point it's more simply and understandably written as `def id_dispatcher(): yield from count()`... – Jon Clements Jun 15 '17 at 11:04
  • @JonClements: but that is not callable... Then the `next(..)` should be called over that. One can indeed use `id_dispatcher().__next__` but usually that is considered bad coding style. – Willem Van Onsem Jun 15 '17 at 11:08
  • @WillemVanOnsem oops good point - then we're back to my original comment anyway... :) – Jon Clements Jun 15 '17 at 11:11
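
The point settled in the comments can be illustrated with a short sketch. It assumes count(1) rather than the count() from the comment so that ids start at 1 as in the answer, and the variable names are illustrative only.

```python
from collections import defaultdict
from itertools import count

# One-liner from the comments: the lambda's default argument holds the counter.
dc = defaultdict(lambda c=count(1): next(c))
first, second, again = dc['@abcd'], dc['@defg'], dc['@abcd']
print(first, second, again)  # 1 2 1


# The generator-function variant: calling it returns a generator object,
# which is not itself callable and so cannot serve as a default factory.
def id_dispatcher():
    yield from count()


gen = id_dispatcher()
print(callable(gen))  # False
print(next(gen))      # 0, values only come out through next()
```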