What you want is very difficult.
If I make a mail server called this.is.my.email.my-domain.com, and an account called martin, my perfectly valid US email would be martin@this.is.my.email.my-domain.com. Emails with more than one domain part are not uncommon (.gov is a common example).
Disallowing emails from the .uk TLD is also problematic, since many US-based people might have a .uk address, for example because they think it sounds nice, work for a UK-based company, have a UK spouse, or used to live in the UK and never changed emails.
If you only want US-based registrations, your options are:
- Ask your users if they are US-based, and tell them your service is only for US-based users if they answer with a non-US country.
- Ask for a US address or phone number. Although this can be faked, it's not easy to get a matching address & ZIP code, for example.
- Use GeoIP, and allow only US IP addresses. This is not foolproof, since people may use your service while abroad, on holiday and such.
In the question's comments, you said:
Does it not make sense that if some one has a .jp TLD, or .co.uk, it stands to reason (with considerable accuracy) that they are internationally based?
Usually, yes. But far from always. My girlfriend has 4 .uk email addresses, and she doesn't live in the UK anymore :-) This is where you have to make a business choice. You can either:
- Turn away potential customers
- Put in more effort to allow customers with slightly "strange" email addresses
Your business, your choice ;-)
So, with that preamble, if you must do this, this is how you could do it:
import re
EMAIL_REGEX = re.compile(r'''
^ # Anchor to the start of the string
[^@]+ # Username
@ # Literal @
([^@.]+){1} # One domain part
\. # Literal .
([^@.]+){1} # One domain part (the TLD)
$ # Anchor to the end of the string
''', re.VERBOSE)
print(EMAIL_REGEX.search('test@example.com'))
print(EMAIL_REGEX.search('test@example.co.uk'))
Of course, this still allows you to register with a .nl address, for example. If you want to allow only a certain set of TLDs, then use:
allowed_tlds = ['com', 'net']  # ... Probably more
result = EMAIL_REGEX.search('test@example.com')
# groups()[1] is the TLD captured by the second pair of parentheses
if result is None or result.groups()[1] not in allowed_tlds:
    print('Not allowed')
However, if you're going to create a whitelist, then you don't need the regexp anymore, since not using it will allow US people with multi-domain addresses to sign up (such as @nlm.nih.gov).
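A whitelist-only check along those lines could look roughly like the sketch below. The function name tld_is_allowed and the contents of the default TLD set are my own assumptions for illustration, and this is not a full email validator: it only looks at whatever follows the last dot of the domain.

```python
def tld_is_allowed(email, allowed_tlds=frozenset({'com', 'net', 'org', 'gov', 'edu', 'us'})):
    """Return True if the last label of the email's domain is in allowed_tlds.

    The default TLD set is a hypothetical example, not a complete list.
    """
    # Split on the last @ so only the domain part is inspected
    local, sep, domain = email.rpartition('@')
    if not sep or not local or not domain:
        return False
    # Require at least one dot, so a bare 'user@com' is rejected
    if '.' not in domain:
        return False
    # The TLD is whatever follows the final dot, compared case-insensitively
    tld = domain.rsplit('.', 1)[-1].lower()
    return tld in allowed_tlds


for address in ('test@example.com', 'someone@nlm.nih.gov', 'test@example.co.uk'):
    print(address, tld_is_allowed(address))
```

Because only the final label is checked, someone@nlm.nih.gov passes no matter how many domain parts come before the TLD, while test@example.co.uk is still turned away.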