1

I need a regex to filter strings. Each string will contain at least one dot, surrounded on both sides by characters from a limited charset.

So I used this (ignore all the spaces):

^[a-z0-9:_-]+ \. [a-z0-9:_-]+$

The problem is that I have to write the exact same sub-pattern, [a-z0-9:_-]+, twice. Is there a better way to write this?

hjpotter92
daisy

5 Answers

2

No, you must explicitly repeat the charset regex before and after the fixed point.

caskey
2

If case doesn't matter, then depending on the language you're using, you can probably get away with this:

^[\w:-]+ \. [\w:-]+$

\w matches [A-Za-z0-9_]


An alternative would be to build the RegExp from strings. Here's a JavaScript example:

var chars = '[\\w:-]+';
var re    = new RegExp('^' + chars + ' \\. ' + chars + '$');

re;
// => /^[\w:-]+ \. [\w:-]+$/

This contrived example doesn't save you much, but depending on how complex your regexen get, this could save you from having to duplicate your character classes. Also, don't forget to double your backslashes (\\) when building a regexp from a string.
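To illustrate the escaping point, here is a minimal sketch (the variable names are only illustrative). Inside a JavaScript string literal, a single backslash before the dot is dropped, so the resulting regexp matches any character rather than a literal dot:

```javascript
// In a string literal, "\." collapses to ".", which matches any character.
var unescaped = new RegExp('^a\.b$');
// Doubling the backslash keeps the escape, so only a literal dot matches.
var escaped = new RegExp('^a\\.b$');

console.log(unescaped.source);      // "^a.b$"
console.log(escaped.source);        // "^a\.b$"
console.log(unescaped.test('axb')); // true
console.log(escaped.test('axb'));   // false
console.log(escaped.test('a.b'));   // true
```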


If I were writing a parser or something, I would probably take the above example one step further and do something like this:

RegExp.build = function(regexen, flags) {
  return new RegExp(regexen.map(function(re) { return re.source }).join(''), flags);
};

var chars = /[\w:-]+/;

RegExp.build([/^/, chars, / \. /, chars, /$/], 'gi');

//=> /^[\w:-]+ \. [\w:-]+$/gi
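For comparison, here is a sketch of the same idea applied to the question's original charset, without adding a property to the built-in RegExp. The helper name buildRegExp is hypothetical. It also assumes the spaces in the question were only for readability, since a space inside a JavaScript regexp is matched literally, and it leaves out the g flag because test() on a global regexp keeps state in lastIndex between calls:

```javascript
// Hypothetical helper: concatenates the sources of several RegExp fragments.
function buildRegExp(parts, flags) {
  return new RegExp(parts.map(function (re) { return re.source; }).join(''), flags);
}

// The character class is written once and reused on both sides of the dot.
var chars = /[a-z0-9:_-]+/;
var re = buildRegExp([/^/, chars, /\./, chars, /$/]);

console.log(re.source);          // "^[a-z0-9:_-]+\.[a-z0-9:_-]+$"
console.log(re.test('foo.bar')); // true
console.log(re.test('foo.'));    // false
console.log(re.test('a.b.c'));   // false: the charset itself excludes dots
```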
Mulan
2

I don't know if Lua supports this syntax (it works in Perl, so it may work with PCRE too):

^([a-z0-9:_-]+)\.(?1)$

(?1) reuses the same pattern as the one used in capture group 1 (i.e. [a-z0-9:_-]+).
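Note that (?1) is not the same thing as the backreference \1. JavaScript's RegExp has no (?1) syntax, but it does have \1, so a short JavaScript sketch can show why \1 is not a substitute: a backreference must match the exact text that group 1 captured, not just the same pattern.

```javascript
// \1 is a backreference: it must repeat the exact text captured by group 1.
var backref = /^([a-z0-9:_-]+)\.\1$/;

console.log(backref.test('foo.foo')); // true: both sides are the same text
console.log(backref.test('foo.bar')); // false: same charset, different text
```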

Toto
  • +1 Didn't know PCRE supported this. Now I'm jealous that it's not supported in JavaScript : – Mulan Sep 16 '13 at 14:04
0

Some languages allow storing regular expressions in a variable, or building them from strings. For example, in Perl you can do:

my $re_l = qr/[a-z0-9:_-]+/;
my $re   = qr/^$re_l\.$re_l$/;
perreal
0

POSITIVE LOOKAHEAD
/^(?=.*[^.]\.[^.])[a-z0-9:_.-]+$/ - at least one dot is surrounded by non-dot characters
/^(?=^([^.]+\.)+[^.]+$)[a-z0-9:_.-]+$/ - at least one dot, and every dot is surrounded by non-dot characters
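A quick way to see the difference between the two patterns is to run both against a few sample inputs. This is only a sketch; the variable names and test strings are made up for illustration:

```javascript
// First pattern: some dot, somewhere, has a non-dot on each side.
var atLeastOne = /^(?=.*[^.]\.[^.])[a-z0-9:_.-]+$/;
// Second pattern: every dot has a non-dot on each side.
var everyDot = /^(?=^([^.]+\.)+[^.]+$)[a-z0-9:_.-]+$/;

['a.b', 'a.b.c', 'a.b..c', '.a.b', 'ab'].forEach(function (s) {
  console.log(s, atLeastOne.test(s), everyDot.test(s));
});
// a.b    true  true
// a.b.c  true  true
// a.b..c true  false
// .a.b   true  false
// ab     false false
```

Unlike the patterns in the question, both of these accept more than one dot, because the dot is part of the final character class.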

Leonid