34

Is there a way to precompile a regex in Perl? I have one that I use many times in a program and it does not change between uses.

brian d foy
Sam Lee
  • For the more general case of substitution, using variables containing the regular expressions and replacements (e.g., substitutions like `s/(\w+)/\u\L$1/g;` (in variables/external data), not just fixed strings in variables), see [bart's answer to *Passing a regex substitution as a variable in Perl*](https://stackoverflow.com/questions/125171/passing-a-regex-substitution-as-a-variable-in-perl/128321#128321) – Peter Mortensen Apr 28 '21 at 18:32

3 Answers

72

For literal (static) regexes there's nothing to do -- Perl will only compile them once.

if ($var =~ /foo|bar/) {
    # ...
}

For regexes stored in variables you have a couple of options. You can use the qr// operator to build a regex object:

my $re = qr/foo|bar/;

if ($var =~ $re) {
    # ...
}

This is handy if you want to use a regex in multiple places or pass it to subroutines.
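As a minimal sketch of passing a compiled regex to a subroutine (the helper name `grep_matching` and the sample words are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the elements of @$items that match $re.
sub grep_matching {
    my ($re, $items) = @_;
    return grep { $_ =~ $re } @$items;
}

my $re    = qr/foo|bar/;               # compiled once, here
my @words = qw(food rebar baz quux);
my @hits  = grep_matching($re, \@words);

print "@hits\n";   # prints "food rebar"
```

The subroutine never sees the pattern source, only the regex object, so the same `$re` can be handed to any number of callers.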

If the regex pattern is in a string, you can use the /o option to promise Perl that it will never change:

my $pattern = 'foo|bar';

if ($var =~ /$pattern/o) {
    # ...
}

It's usually better not to do that, though. Perl already notices when the interpolated variable hasn't changed and skips recompiling the regex, so specifying /o is probably a premature micro-optimization. It's also a potential pitfall: if the variable does change, /o makes Perl keep using the old regex anyway, which can lead to hard-to-diagnose bugs.
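A small sketch of that pitfall, assuming two hypothetical helpers (`matches_once` and `matches_fresh`) that differ only in the /o flag:

```perl
use strict;
use warnings;

# Hypothetical helper that (unwisely) uses /o on an interpolated pattern.
sub matches_once {
    my ($text, $pattern) = @_;
    return $text =~ /$pattern/o ? 1 : 0;
}

# The same helper without /o.
sub matches_fresh {
    my ($text, $pattern) = @_;
    return $text =~ /$pattern/ ? 1 : 0;
}

my $first  = matches_once('foo', 'foo');   # compiles /foo/ and freezes it
my $second = matches_once('bar', 'bar');   # still tests against /foo/
my $third  = matches_fresh('bar', 'bar');  # recompiled because the string changed

print "$first $second $third\n";   # prints "1 0 1"
```

The second call silently fails because the match operator inside `matches_once` keeps the pattern from its first execution.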

Peter Mortensen
Michael Carman
  • 3
    These are true; however, qr// has been supported for many years now (it's existed since 5.005, and I think there's been no issues with it since 5.8) – ephemient Jun 04 '09 at 21:12
  • 10
    @ephemient Well, 5.10 has a nasty memory leak associated with qr// (and compiling regexes in general), but that has been fixed. If you are using 5.10, you can check to see if you have the memory leak by saying perl -e 'qr// while 1'. I know that the OS X version of ActiveState Perl 5.10 is still broken. – Chas. Owens Jun 04 '09 at 21:23
  • 4
    Note from 2016: the `/o` modifier has been deprecated. See [this question](http://stackoverflow.com/q/550258/477563) for details. – Mr. Llama Apr 27 '16 at 21:39
  • To reuse the same precompiled regexp in several places, you can write my $re = qr/foo|bar/ and then use next if ($var =~ /something $re something/) as many times as you like. It is documented in perlre – Znik Nov 16 '18 at 14:00
  • 1
    I would suggest always measuring speed results. Never trust "it should be faster with precompiled stuff" feeling. For me, using `qr//` instead of just using regexps inline in code made it slower by more than 60% in perl 5.24.1 ! – Matija Nalis Mar 16 '19 at 01:34
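For anyone who wants to measure rather than guess, here is a rough sketch using the core Benchmark module. The test data and the sub names `count_inline` and `count_qr` are invented for illustration, and the numbers will vary by Perl version and workload:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 1,000 short strings; every third one contains "foo".
my @lines = map { $_ % 3 ? "line $_ baz" : "line $_ foo" } 1 .. 1000;

my $re = qr/foo|bar/;

# Count matches with the pattern written inline.
sub count_inline {
    my $n = 0;
    for my $line (@lines) { $n++ if $line =~ /foo|bar/ }
    return $n;
}

# Count matches with the precompiled qr// object.
sub count_qr {
    my $n = 0;
    for my $line (@lines) { $n++ if $line =~ $re }
    return $n;
}

# Run each variant 500 times and print a comparison table.
cmpthese(500, { inline => \&count_inline, qr => \&count_qr });
```

Both subs do the same work, so any difference in the table comes from how the pattern is supplied. Treat the result as specific to your own Perl build and data.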
20

Use the qr// operator (documented in perlop - Perl operators and precedence under Regexp Quote-Like Operators).

my $regex = qr/foo\d/;
$string =~ $regex;
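A slightly fuller sketch, assuming a case-insensitive variant of the same pattern and a made-up word list. Flags given to qr// travel with the regex object:

```perl
use strict;
use warnings;

my $regex = qr/foo\d/i;            # the /i flag is stored in the object

print ref($regex), "\n";           # prints "Regexp"

# Hypothetical sample data.
my @matches = grep { $_ =~ $regex } qw(foo1 FOO2 food);

print "@matches\n";                # prints "foo1 FOO2"
```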
Peter Mortensen
tsee
0

To clarify, a precompiled regex can be used in any of the following ways:

my $re = qr/foo|bar/;  # Precompile phase
if ( $string =~ $re ) { ... }    # Direct use
if ( $string =~ /$re/ ) { ... }  # Same as above, just more verbose
if ( $string =~ m/something $re other/x ) { ... }  # As part of a bigger regex
if ( $string =~ s/$re/replacement/ ) { ... }  # Direct use in a substitution
if ( $string =~ s/some $re other/replacement/x ) { ... }  # Part of a bigger regex and a substitution at once

It is documented in perlre, but there are no direct examples.
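One question that comes up is what happens when a quantifier directly follows an interpolated `$re`. A qr// object interpolates as a self-contained non-capturing group, so the quantifier applies to the whole pattern. The sketch below contrasts that with a plain string holding the same pattern (the variable names and the test string 'barfoo' are chosen only for illustration):

```perl
use strict;
use warnings;

my $re  = qr/foo|bar/;   # compiled regex object
my $str = 'foo|bar';     # the same pattern as a plain string

# Interpolating the qr// object: "+" quantifies the whole (foo|bar) group.
my $with_qr  = 'barfoo' =~ /^${re}+\z/  ? 1 : 0;

# Interpolating the plain string: the pattern becomes /^foo|bar+\z/,
# i.e. "starts with foo" OR "ends with ba followed by one or more r".
my $with_str = 'barfoo' =~ /^${str}+\z/ ? 1 : 0;

print "$with_qr $with_str\n";   # prints "1 0"
```

If you build patterns from plain strings instead, wrapping the interpolation in `(?:...)` yourself gives the same grouping.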

Peter Mortensen
Znik
  • 2
    I don't think "use precompiled as a part of bigger regex" is true. Regexes cannot be composed from strings. E.g., consider what should happen if the next character after `$re` is `+` – Vladimir Alexiev Dec 19 '18 at 13:47
  • It depends on the implementation and on the flags used for the regexp. Only benchmarks and tests can show what the compiler really does. When the regexp is compiled once and for all, changing the value of $re has no effect; the old $re value is used. – Znik Apr 23 '19 at 11:33