What's a better way to write this regex, which makes sure the target string contains at least one dot?

Question

I need a regex to filter a string. The string must contain at least one dot, and the text on either side of the dot is drawn from a limited character set.

So I used the following (ignore all spaces):

^[a-z0-9:_-]+ \. [a-z0-9:_-]+$

The problem is that I have to write the exact same sub-pattern, [a-z0-9:_-]+, twice. Is there a better way to write it?

See http://stackoverflow.com/questions/6775383/can-i-define-custom-character-class-shorthands — Sam Dufel, Sep 16 '13 at 02:41
In Ruby, you could write `e = '[a-z0-9:_-]+'; "ab_- . g1:" =~ /^#{e} \. #{e}$/ # => 0`. — Cary Swoveland, Sep 16 '13 at 03:27
Lua's pattern matching functions (`string.match`, etc.) do not use regular expressions. They use a similar but different syntax. — Colonel Thirty Two, Sep 16 '13 at 13:21

score 2 · Answer 1 · answered Sep 16 '13 at 02:40

No, you must explicitly repeat the charset regex before and after the fixed point.

answered Sep 16 '13 at 02:40

caskey


Mulan · Answer 2 · 2013-09-16T08:28:27.800

If case doesn't matter, then depending on the language you're using, you can probably get away with this:

^[\w:-]+ \. [\w:-]+$

\w matches [A-Za-z0-9_]

An alternative would be to build the RegExp from strings. Here's a JavaScript example:

// The + quantifier belongs to the shared piece, so each side may be more than one character.
var chars = '[\\w:-]+';
var re    = new RegExp('^' + chars + ' \\. ' + chars + '$');

re;
// => /^[\w:-]+ \. [\w:-]+$/

This contrived example doesn't save you much, but depending on how complex your regexen get, it could save you from having to duplicate your character classes. Also, don't forget to double your backslashes (\\) when building a regexp from a string.
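As a side note on the backslash doubling, here is a minimal sketch of one way to avoid it. It assumes an ES2015+ runtime where String.raw is available, and it assumes the spaces in the question's pattern are only there for readability, so they are dropped (JavaScript regexes have no free-spacing flag, so a space in the pattern would have to match a literal space). The variable names are just for illustration.

```javascript
// String.raw keeps backslashes as written, so no doubling is needed.
var chars = String.raw`[\w:-]+`;               // same text as '[\\w:-]+'
var re = new RegExp('^' + chars + String.raw`\.` + chars + '$');

var okDotted = re.test('foo.bar');             // true
var okPlain  = re.test('foobar');              // false, there is no dot

// Forgetting to double the backslashes in an ordinary string literal
// silently changes the pattern: '\w' is just 'w' and '\.' is just '.'.
var wrong = new RegExp('^' + '[\w:-]+' + '\.' + '[\w:-]+' + '$');
var wrongResult = wrong.test('foo.bar');       // false, the class is now [w:-]
```

The broken variant does not throw, which is why this mistake is easy to miss.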

If I were writing a parser or something, I would probably take the above example one step further and do something like this:

RegExp.build = function(regexen, flags) {
  return new RegExp(regexen.map(function(re) { return re.source }).join(''), flags);
};

var chars = /[\w:-]+/;

RegExp.build([/^/, chars, / \. /, chars, /$/], 'gi');

//=> /^[\w:-]+ \. [\w:-]+$/gi
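For what it's worth, here is a rough usage sketch of the same idea. The buildRegExp helper is a hypothetical stand-in for RegExp.build that avoids patching the built-in RegExp object, and the spaces from the question's pattern are dropped on the assumption that they were only for readability. It also shows one thing to watch for if you keep the 'g' flag and call test() more than once.

```javascript
// Hypothetical helper: joins the sources of several regexes into one.
function buildRegExp(parts, flags) {
  return new RegExp(parts.map(function (re) { return re.source; }).join(''), flags);
}

var chars = /[\w:-]+/;

// Without 'g', test() is stateless.
var dotted = buildRegExp([/^/, chars, /\./, chars, /$/], 'i');
var results = ['foo.bar', 'a_b:c.d-e', 'foobar', '.bar', 'foo.'].map(function (s) {
  return dotted.test(s);
});
// results => [true, true, false, false, false]

// With 'g', a successful test() moves lastIndex forward,
// so the next call on the same string starts from there and fails.
var global = buildRegExp([/^/, chars, /\./, chars, /$/], 'gi');
var first  = global.test('foo.bar');  // true, lastIndex is now 7
var second = global.test('foo.bar');  // false, lastIndex is reset to 0
```

If the regex is only used for validation, leaving out 'g' sidesteps the issue.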

score 2 · Answer 3 · answered Sep 16 '13 at 09:28

I don't know if Lua supports this syntax (it works in Perl, so it may also work with PCRE):

^([a-z0-9:_-]+)\.(?1)$

(?1) reuses the same pattern as the one used in capture group 1 (i.e. [a-z0-9:_-]+).
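To make the difference from a backreference concrete, here is a small JavaScript sketch. JavaScript is used only for illustration: as far as I know it has no (?1) syntax, so the sketch just checks that the constructor rejects it. A backreference such as \1 repeats the captured text, not the pattern, so it is not a substitute.

```javascript
// \1 must match the same TEXT that group 1 captured.
var backref = /^([a-z0-9:_-]+)\.\1$/;
var sameText  = backref.test('abc.abc');  // true
var otherText = backref.test('abc.xyz');  // false, although 'xyz' fits the class

// (?1) asks to re-run the PATTERN of group 1; JavaScript rejects the syntax.
var subroutineError = null;
try {
  new RegExp('^([a-z0-9:_-]+)\\.(?1)$');
} catch (e) {
  subroutineError = e;  // a SyntaxError in current engines
}
```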

answered Sep 16 '13 at 09:28

Toto


+1 Didn't know PCRE supported this. Now I'm jealous that it's not supported in JavaScript. – Mulan Sep 16 '13 at 14:04

score 0 · Answer 4 · answered Sep 16 '13 at 02:45

Some languages allow storing regular expressions in a variable, or building them from strings. For example, in Perl you can do:

my $re_l = qr/[a-z0-9:_-]+/;
my $re   = qr/^$re_l\.$re_l$/;

answered Sep 16 '13 at 02:45

perreal


score 0 · Answer 5 · answered Sep 16 '13 at 02:59

Use a positive lookahead, so the character class is written only once (with the dot added to it):
/^(?=.*[^.]\.[^.])[a-z0-9:_.-]+$/ - at least one dot is surrounded by non-dot characters
/^(?=^([^.]+\.)+[^.]+$)[a-z0-9:_.-]+$/ - at least one dot, and every dot is surrounded by non-dot characters
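A quick way to see how the two patterns differ is to run them over a few sample strings. The sketch below does that in JavaScript; the sample strings and variable names are made up for illustration.

```javascript
var someDotSurrounded  = /^(?=.*[^.]\.[^.])[a-z0-9:_.-]+$/;
var everyDotSurrounded = /^(?=^([^.]+\.)+[^.]+$)[a-z0-9:_.-]+$/;

var samples = ['a.b', 'a.b.c', 'a.b..c', 'a..b', '.a.b', 'ab'];

// Each row is [input, matches first pattern, matches second pattern].
var table = samples.map(function (s) {
  return [s, someDotSurrounded.test(s), everyDotSurrounded.test(s)];
});
// 'a.b'    => true,  true
// 'a.b.c'  => true,  true
// 'a.b..c' => true,  false  (one dot is fine, but the double dot is not)
// 'a..b'   => false, false
// '.a.b'   => true,  false  (leading dot has nothing before it)
// 'ab'     => false, false  (no dot at all)
```

Note that both patterns accept more than one dot, unlike the regex in the question, which allows exactly one.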

answered Sep 16 '13 at 02:59

Leonid

