Could you provide a regex that matches Twitter usernames?
Extra bonus if a Python example is provided.
(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9-_]+)
I've used this as it disregards emails.
Here is a sample tweet:
@Hello how are @you doing @my_friend, email @000 me @ whats.up@example.com @shahmirj
Matches:
@Hello
@you
@my_friend
@shahmirj
It will also work for hashtags; I use the same expression with the @ changed to #.
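For Python, here is a rough sketch of the same idea run against the sample tweet. It swaps the nested lookbehind for a negative lookbehind, which I believe is equivalent ("not preceded by a letter, digit, dash, underscore or dot"); that rewrite and the name MENTION_RE are my own assumptions, not part of the expression above.

```python
import re

# Assumed-equivalent rewrite: a negative lookbehind replaces (?<=^|(?<=[^...])).
# The capture group is unchanged: a letter first, then letters, digits, '_' or '-'.
MENTION_RE = re.compile(r'(?<![A-Za-z0-9_.-])@([A-Za-z]+[A-Za-z0-9_-]+)')

tweet = ("@Hello how are @you doing @my_friend, email @000 me @ "
         "whats.up@example.com @shahmirj")

mentions = MENTION_RE.findall(tweet)
print(mentions)  # ['Hello', 'you', 'my_friend', 'shahmirj']
```

The email address is skipped because its @ is preceded by a letter, and @000 is skipped because the name must start with a letter.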
If you're talking about the @username thing they use on Twitter, then you can use this:
import re
twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')
To make every instance an HTML link, you could do something like this:
my_html_str = twitter_username_re.sub(lambda m: '<a href="http://twitter.com/%s">%s</a>' % (m.group(1), m.group(0)), my_tweet)
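As a runnable sketch of the substitution above (the helper name linkify and the sample strings are made up for illustration), note that this bare pattern has no check on what precedes the @, so it will also fire inside email addresses:

```python
import re

twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')

def linkify(tweet):
    """Wrap each @username in an anchor tag pointing at its profile page."""
    return twitter_username_re.sub(
        lambda m: '<a href="http://twitter.com/%s">%s</a>' % (m.group(1), m.group(0)),
        tweet,
    )

linked = linkify('thanks @jack')
print(linked)  # thanks <a href="http://twitter.com/jack">@jack</a>

# Caveat: the domain part of an email address is picked up as a username.
email_hits = twitter_username_re.findall('write to me@example.com')
print(email_hits)  # ['example']
```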
The regex I use, which has been tested in multiple contexts:
/(^|[^@\w])@(\w{1,15})\b/
This is the cleanest way I've found to test for and replace Twitter usernames in strings.
#!/usr/bin/env python3
import re
text = "@RayFranco is answering to @jjconti, this is a real '@username83' but this is an@email.com, and this is a @probablyfaketwitterusername"
# Group 1 keeps the character before the @; group 2 is the username (1 to 15 word characters).
ftext = re.sub(r'(^|[^@\w])@(\w{1,15})\b', r'\1<a href="http://twitter.com/\2">\2</a>', text)
print(ftext)
This returns, as expected:
<a href="http://twitter.com/RayFranco">RayFranco</a> is answering to <a href="http://twitter.com/jjconti">jjconti</a>, this is a real '<a href="http://twitter.com/username83">username83</a>' but this is an@email.com, and this is a @probablyfaketwitterusername
Based on the Twitter specs:
Your username cannot be longer than 15 characters. Your real name can be longer (20 characters), but usernames are kept shorter for the sake of ease. A username can only contain alphanumeric characters (letters A-Z, numbers 0-9) with the exception of underscores, as noted above. Check to make sure your desired username doesn't contain any symbols, dashes, or spaces.
Twitter recently open-sourced implementations, in several languages including Java, Ruby (gem) and JavaScript, of the code they use for finding usernames, hashtags, lists and URLs.
It is very regular expression oriented.
This is a method I have used in a project. It takes the text attribute of a tweet object and returns the text with both the hashtags and user_mentions linked to their appropriate pages on Twitter, complying with the most recent Twitter display guidelines:
import re

def link_tweet(tweet):
    """
    This method takes the text attribute from a tweet object and returns it with
    user_mentions and hashtags linked
    """
    # Link @mentions that appear at the start of the text or after whitespace.
    tweet = re.sub(r'(\A|\s)@(\w+)', r'\1@<a href="http://www.twitter.com/\2">\2</a>', str(tweet))
    # Link #hashtags the same way, pointing at the search page.
    return re.sub(r'(\A|\s)#(\w+)', r'\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>', tweet)
When you call this method, pass in my_tweet[x].text as the parameter. Hope this is helpful.
The only characters accepted in the form are A-Z, 0-9, and underscore. Usernames are not case-sensitive, though, so you could use r'(?i)@[a-z0-9_]+' to match everything correctly and also discern between users.
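Since usernames are not case-sensitive, telling users apart usually means comparing lowercased names. A minimal sketch, assuming that is the goal (the function name unique_mentions and the sample text are hypothetical):

```python
import re

# re.IGNORECASE is the flag form of the inline (?i) modifier.
MENTION_RE = re.compile(r'@([a-z0-9_]+)', re.IGNORECASE)

def unique_mentions(text):
    """Return mentioned usernames, treating @Jack and @jack as the same user."""
    seen = set()
    result = []
    for name in MENTION_RE.findall(text):
        key = name.lower()  # case-insensitive identity
        if key not in seen:
            seen.add(key)
            result.append(name)  # keep the first spelling encountered
    return result

users = unique_mentions('@Jack met @jack and @JACK_2')
print(users)  # ['Jack', 'JACK_2']
```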
This regex seems to solve Twitter usernames:
^@[A-Za-z0-9_]{1,15}$
Max 15 characters. It allows an underscore directly after the @ (which Twitter does) and allows names made entirely of underscores (which, after a quick search, Twitter apparently also allows). Because the pattern is anchored at both ends, it excludes email addresses.
Shorter: /@([\w]+)/ works fine.
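One thing to watch if you port the \w version to Python 3: for str patterns \w is Unicode-aware there, so it accepts accented letters that Twitter usernames cannot contain. A small sketch of the difference (the sample text is made up):

```python
import re

# 'caf\u00e9' is "café" written with an escape so the accented letter is unambiguous.
text = 'ping @caf\u00e9 and @user_1'

# Default in Python 3: \w matches Unicode word characters, including the accented e.
unicode_hits = re.findall(r'@(\w+)', text)
print(unicode_hits)  # ['café', 'user_1']

# re.ASCII restricts \w to [a-zA-Z0-9_], which is closer to the username rules.
ascii_hits = re.findall(r'@(\w+)', text, re.ASCII)
print(ascii_hits)  # ['caf', 'user_1']
```

Neither variant enforces the 15-character limit; that still needs {1,15} and a boundary check.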
I have used the existing answers and modified them for my use case (the username must be longer than 4 characters):
^[A-Za-z0-9_]{5,15}$
Source for the rules: https://help.twitter.com/en/managing-your-account/twitter-username-rules
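A minimal Python sketch of that 5 to 15 character check (the names USERNAME_RE and is_valid_username are hypothetical). It spells the range as A-Za-z on purpose: the shorthand A-z also covers the punctuation that sits between Z and a in ASCII, such as ^ and the square brackets.

```python
import re

# fullmatch() must consume the whole string, so no ^ or $ is needed here.
USERNAME_RE = re.compile(r'[A-Za-z0-9_]{5,15}')

def is_valid_username(name):
    """True if name is 5 to 15 characters of letters, digits or underscores."""
    return USERNAME_RE.fullmatch(name) is not None

print(is_valid_username('jack_'))     # True  (exactly 5 characters)
print(is_valid_username('jack'))      # False (too short)
print(is_valid_username('bad^name'))  # False

# The A-z shorthand would let the caret through:
loose_hit = re.fullmatch(r'[A-z0-9_]{5,15}', 'bad^name') is not None
print(loose_hit)  # True
```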
In case you need to match all of the handle, @handle and twitter.com/handle formats, this is a variation:
import re
match = re.search(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)', text)
handle = match.group(1)
Explanation, examples and working regex here: https://regex101.com/r/7KbhqA/3
Matched
myhandle
@myhandle
@my_handle_2
twitter.com/myhandle
https://twitter.com/myhandle
https://twitter.com/myhandle/randomstuff
Not matched
mysuperhandleistoolong
@mysuperhandleistoolong
https://twitter.com/mysuperhandleistoolong
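Because re.search returns None for the "Not matched" inputs, calling match.group(1) directly would raise an AttributeError on them. A small sketch that wraps the same pattern with a guard (the names HANDLE_RE and extract_handle are my own):

```python
import re

HANDLE_RE = re.compile(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)')

def extract_handle(text):
    """Return the handle from any of the supported formats, or None."""
    match = HANDLE_RE.search(text)
    return match.group(1) if match else None

print(extract_handle('@my_handle_2'))                              # my_handle_2
print(extract_handle('https://twitter.com/myhandle/randomstuff'))  # myhandle
print(extract_handle('@mysuperhandleistoolong'))                   # None
```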
You can use the following regex: ^@[A-Za-z0-9_]{1,15}$
In Python:
import re
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
pattern.match('@Your_handle')
This will check if the string exactly matches the regex.
In a 'practical' setting, you could use it as follows:
pattern = re.compile(r'^@[A-Za-z0-9_]{1,15}$')
if pattern.match('@Your_handle'):
    print('Match')
else:
    print('No Match')
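One caveat with "exactly matches": in Python, $ also matches just before a trailing newline, so input read from a file or form may slip through with a newline attached. A hedged sketch of a stricter variant using fullmatch (the names strict and candidate are illustrative):

```python
import re

pattern = re.compile(r'^@[A-Za-z0-9_]{1,15}$')
strict = re.compile(r'@[A-Za-z0-9_]{1,15}')  # no anchors; used with fullmatch()

candidate = '@Your_handle\n'  # note the trailing newline

anchored_ok = bool(pattern.match(candidate))
strict_ok = bool(strict.fullmatch(candidate))

print(anchored_ok)  # True: $ matches before the final newline
print(strict_ok)    # False: fullmatch() must consume the whole string
```

Stripping the input first, or using \Z instead of $, would be other ways to get the same effect.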