I want to parse HTML documents for links to Twitter profiles using a regex and preg_match_all() in PHP. The Twitter links have this form:
http(s)://twitter.com/#!/twitter_name
I only want to match links that point purely to the profile page (i.e. nothing after the twitter_name).
I would like to handle both http and https, since both are common in these links.
I would also like to handle both //www.twitter.com and //twitter.com, which are also common.
How should I structure my regex?
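
For reference, here is a minimal sketch of the kind of pattern that might fit these requirements. It rests on a few assumptions that are not stated above: that a twitter_name is 1 to 15 letters, digits, or underscores, and that a link ends at a quote, whitespace, a `<`, or the end of the input. The sample HTML and the names in it are made up for illustration.

```php
<?php
// Assumption: a twitter_name is 1-15 characters from [A-Za-z0-9_].
// Assumption: a link ends at a quote, whitespace, '<', or end of input.
// '~' is used as the delimiter so '/' and '#' need no escaping.
$pattern = '~https?://(?:www\.)?twitter\.com/#!/([A-Za-z0-9_]{1,15})(?=["\'\s<]|$)~i';

// Hypothetical sample input: two pure profile links, two that continue
// after the name and should therefore be skipped.
$html = <<<'HTML'
<a href="http://twitter.com/#!/alice">a</a>
<a href="https://www.twitter.com/#!/bob_42">b</a>
<a href="https://twitter.com/#!/carol/status/123">c</a>
<a href="http://twitter.com/#!/dave?ref=x">d</a>
HTML;

// $matches[0] holds the full URLs, $matches[1] the captured names.
preg_match_all($pattern, $html, $matches);

print_r($matches[1]);
```

The lookahead at the end is what rejects links with anything after the name: the name must be followed directly by a terminator, so `carol/status/123` and `dave?ref=x` do not match. If protocol-relative links (starting with `//`) also need to be caught, changing `https?:` to `(?:https?:)?` should cover them.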