7

I searched but couldn't find what I needed, although I think it should be simple (if you have any Python experience, which I don't).

Given a string, I want to verify, in Python, that it contains ONLY alphanumeric characters: a-zA-Z0-9 and . _ -

Examples:

Accepted:

bill-gates

Steve_Jobs

Micro.soft

Rejected:

Bill gates -- no spaces allowed

me@host.com -- @ is not alphanumeric

I'm trying to use:

if re.match("^[a-zA-Z0-9_.-]+$", username) == True:

But that doesn't seem to do the job...

Warlax
  • re.match() doesn't return a boolean; it returns a [MatchObject](http://docs.python.org/library/re.html#re.match), which "always has a boolean value True", or None. – mg. Mar 25 '10 at 21:58
  • It is always bad to use `== True`. It is at best redundant and in a case like this, just does not work. – Mike Graham Mar 25 '10 at 22:27
  • Do you really consider (for example) `---.___` to be a valid match? – John Machin Mar 26 '10 at 08:24

6 Answers

21

`re.match` does not return a boolean; it returns a `MatchObject` on a match, or `None` on a non-match.

>>> re.match("^[a-zA-Z0-9_.-]+$", "hello")
<_sre.SRE_Match object at 0xb7600250>
>>> re.match("^[a-zA-Z0-9_.-]+$", "    ")
>>> print re.match("^[a-zA-Z0-9_.-]+$", "    ")
None

So, you shouldn't write `re.match(...) == True`; rather, you should check `re.match(...) is not None`, which in this case can be shortened to just `if re.match(...)`.
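A minimal sketch of this point, assuming Python 3. The pattern is the one from the question; the helper name `is_valid_username` is made up for illustration.

```python
import re

# Pattern taken from the question; the constant name is hypothetical.
USERNAME_PATTERN = r"^[a-zA-Z0-9_.-]+$"


def is_valid_username(username):
    """Return True if username matches the pattern, else False."""
    # re.match gives back a match object or None, never True/False,
    # so compare against None to get a real boolean.
    return re.match(USERNAME_PATTERN, username) is not None


match = re.match(USERNAME_PATTERN, "bill-gates")
print(match == True)   # False: a match object never compares equal to True
print(bool(match))     # True: but it is truthy, so "if match:" works

for name in ["bill-gates", "Steve_Jobs", "Micro.soft", "Bill gates", "me@host.com"]:
    print(name, is_valid_username(name))
```

The first three names from the question are accepted and the last two are rejected.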

Mark Rushakoff
5

Never use `== True` or `== False` in a comparison. Many types already have a bool equivalent which you should use instead:

if re.match("^[a-zA-Z0-9_.-]+$", username):
Ignacio Vazquez-Abrams
2

You could also shorten it slightly to:

if re.match(r'^[\w.-]+$', username):
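One caveat, sketched here assuming Python 3 and `str` patterns: `\w` is Unicode-aware by default, so the short form accepts more than `a-zA-Z0-9_` unless the `re.ASCII` flag is passed. The variable names below are illustrative only.

```python
import re

EXPLICIT = re.compile(r"^[a-zA-Z0-9_.-]+$")
SHORT = re.compile(r"^[\w.-]+$")                   # \w also matches non-ASCII letters
SHORT_ASCII = re.compile(r"^[\w.-]+$", re.ASCII)   # \w limited to [a-zA-Z0-9_]

name = "jos\u00e9"  # "jose" with an accented final letter

explicit_ok = EXPLICIT.match(name) is not None
short_ok = SHORT.match(name) is not None
short_ascii_ok = SHORT_ASCII.match(name) is not None

print(explicit_ok, short_ok, short_ascii_ok)  # False True False
```

If only the ASCII characters listed in the question should be accepted, the explicit class or the `re.ASCII` variant is the safer choice.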
2

I would consider these rules for a valid username:

1) Username must be 6-30 characters long
2) Username may only contain:

  • Uppercase and lowercase letters
  • Digits 0-9
  • The special characters _ - .

3) Username may not:

  • Begin or end with any of the characters _ - .

  • Contain two or more of the characters _ - . in a row

An example of usage:
if re.match(r'^(?![-._])(?!.*[_.-]{2})[\w.-]{6,30}(?<![-._])$', username) is not None:
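To see how these rules play out, here is a small sketch assuming Python 3. The pattern is the one above; the `re.ASCII` flag is an added assumption so that `\w` means only `[a-zA-Z0-9_]`, and the name `STRICT_USERNAME` and the sample names are made up for illustration.

```python
import re

# Pattern from the answer above; re.ASCII is an assumption added here
# so that \w does not match non-ASCII letters.
STRICT_USERNAME = re.compile(
    r"^(?![-._])(?!.*[_.-]{2})[\w.-]{6,30}(?<![-._])$",
    re.ASCII,
)

candidates = [
    "bill-gates",   # accepted
    "Steve_Jobs",   # accepted
    "bill",         # rejected: shorter than 6 characters
    "-billgates",   # rejected: begins with a special character
    "billgates_",   # rejected: ends with a special character
    "bill--gates",  # rejected: two special characters in a row
    "bill gates",   # rejected: space is not an allowed character
]

for name in candidates:
    verdict = "ok" if STRICT_USERNAME.match(name) else "rejected"
    print(f"{name!r}: {verdict}")
```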

sp_omer
1

If you are going to use the same regular expression many times, you can compile it for speed (or readability):

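import re

# Compile once, then reuse the pattern object for every check.
ALPHANUM = re.compile(r'^[a-zA-Z0-9_.-]+$')

users = ['bill-gates', 'Bill gates']  # example data, not from the original answer
for u in users:
    if ALPHANUM.match(u) is None:
        print("invalid")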

From the docs:

The compiled versions of the most recent patterns passed to re.match(), re.search() or re.compile() are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.
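A related sketch, assuming Python 3.4 or later, where compiled patterns also have a `fullmatch()` method. It matters for validation because `$` also matches just before a trailing newline, so an anchored pattern used with `match()` accepts a name that ends in `"\n"`. The name `ALPHANUM_FULL` is hypothetical.

```python
import re

ALPHANUM = re.compile(r"^[a-zA-Z0-9_.-]+$")
ALPHANUM_FULL = re.compile(r"[a-zA-Z0-9_.-]+")  # no anchors; meant for fullmatch()

with_newline = "bill-gates\n"

# "$" matches at the end of the string or just before a final newline.
dollar_accepts = ALPHANUM.match(with_newline) is not None
# fullmatch() requires the whole string, newline included, to match.
fullmatch_accepts = ALPHANUM_FULL.fullmatch(with_newline) is not None

print(dollar_accepts, fullmatch_accepts)  # True False
```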

Dragon Dave
fabrizioM
0

I do my validation this way in my utils class:

import re

def valid_re(self, s, r):
    # Return a match object if s matches the pattern r, otherwise None.
    reg = re.compile(r)
    return reg.match(s)

Then I call the utils instance, and check this way:

if not utils.valid_re(username, r'^[a-zA-Z0-9_.-]+$'):
    error = "Invalid username!"
Daniel Watson