Technical question:

Given a regex:

my $regEx = qr{whatever$myVar}oxi; # Notice /o for "compile-once"

What is the most effective way to force it to recompile on demand (e.g. when I know from the program logic that the value of $myVar changed), without dropping /o and depending on Perl's internal smarts to auto-recompile?

NOTE: The regex is used in a substitution, which may affect re-compilation rules sans /o:

$string2 =~ s/$regEx//;

The context is:

  • I have a regular expression that is built by slurping in a fairly long (>1k long) string from a config file.

    • That file is re-read once every 60 minutes.

    • If the string read from the file changes (as defined by changing file timestamp), I want to re-compile the regex using the re-slurped string value in $myVar.

  • The regex is used repeatedly and frequently in the Perl module running under mod_perl.

    • This means that (coupled with the string being >1-2k long) I must use the "/o" modifier to force compile-once on the regex, to avoid the performance hit of Perl repeatedly checking whether the variable value changed (this heuristic is from perlop qr//, since the regex is used as part of s/// as shown above and not by itself as a match).

    • That in turn means that, when I know that the variable changed after re-slurping it an hour later, I need to force the regex to re-compile despite the /o modifier.

UPDATE: Here's an illustration of why I need /o. Without it, the regex is recompiled (and thus necessarily checked) on every loop iteration; with it, it is NOT:

$ perl -e '{for (my $i=0; $i<3; $i++) {
                 my $re = qr{$i}oix; $s="123"; $s =~ s/$re//; 
                 print "i=$i; s=$s\n"; }}'
i=0; s=123
i=1; s=123
i=2; s=123

$ perl -e '{ for (my $i=0; $i<3; $i++) { 
                  my $re = qr{$i}ix; $s="123"; $s =~ s/$re//; 
                  print "i=$i; s=$s\n"; }}'
i=0; s=123
i=1; s=23
i=2; s=13
DVK

3 Answers

when I know from the program logic that $myVar value changed

m//, s/// and qr// only recompile if the pattern changes. All you have to do to get the behaviour you requested is to remove the /o.

$ perl -Mre=debug -e'
    qr/$_/ for qw( abc abc def def abc abc );
' 2>&1 | grep Compiling
Compiling REx "abc"
Compiling REx "def"
Compiling REx "abc"

Therefore,

If the string read from the file changes (as defined by changing file timestamp), I want to re-compile the regex using the re-slurped string value in $myVar.
my $new_myVar = ...;
if ($myVar ne $new_myVar) {
   $re = qr/$new_myVar/;
   $myVar = $new_myVar;
}
...
s/$re/.../

or just

$myVar = ...;
...
s/$myVar/.../
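
The first variant above can be tied to the timestamp check described in the question. What follows is only a minimal sketch under assumptions: the helper name `current_regex`, the `%cache` hash and the idea of passing the config file path as an argument are all hypothetical, not taken from the question's code. It recompiles with a plain qr// (no /o) only when the file's mtime differs from the one last seen.

```perl
use strict;
use warnings;

# Hypothetical per-file cache: path => { mtime => ..., re => ... }.
my %cache;

# Return a compiled regex for the pattern stored in $path,
# recompiling only when the file's modification time has changed.
sub current_regex {
    my ($path) = @_;

    my $mtime = (stat $path)[9];
    die "Cannot stat $path: $!" unless defined $mtime;

    my $entry = $cache{$path};
    if (!$entry || $entry->{mtime} != $mtime) {
        # Slurp the whole file as the pattern source.
        open my $fh, '<', $path or die "Cannot open $path: $!";
        my $pattern = do { local $/; <$fh> };
        close $fh;
        chomp $pattern;

        # No /o here: this qr// runs only when we decide to recompile.
        $entry = $cache{$path} = { mtime => $mtime, re => qr{$pattern}xi };
    }
    return $entry->{re};
}
```

A caller would then fetch the object once per request, e.g. `my $re = current_regex($path);` followed by `$string2 =~ s/$re//;`, where `$path` stands for wherever the config file actually lives.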
ikegami

You basically answered your own question. Use qr{...} to create a compiled regexp object and then use it:

my $re = qr{...};

...

if ($str =~ $re) {
   # this used the statically compiled object
}

...

if ($time_to_recompile) {
    $re = qr{...};
}

You do not even need the "/o" modifier.
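
To make the point concrete, here is a small sketch (the strings and variable names are purely illustrative) showing that a qr{} object keeps the pattern it was compiled with until it is explicitly reassigned, with no /o involved:

```perl
use strict;
use warnings;

my $myVar = 'x13';
my $re    = qr{some text $myVar and more};

# Changing the variable afterwards does not touch the compiled object.
$myVar = 'zz22';

my $still_old = 'some text x13 and more'  =~ $re ? 1 : 0;
my $sees_new  = 'some text zz22 and more' =~ $re ? 1 : 0;

# Recompile explicitly; only now does the new value take effect.
$re = qr{some text $myVar and more};
my $after = 'some text zz22 and more' =~ $re ? 1 : 0;

print "old=$still_old new=$sees_new after=$after\n";   # old=1 new=0 after=1
```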

Nemo
  • I don't think that works. The second block in your code is executed A LOT, and to the best of my understanding it would try and detect whether the value of interpolated variable changed EACH TIME. – DVK Jun 01 '11 at 18:45
  • @Nemo - I see what the problem is now. I neglected to mention that the regex in my code is NOT used in the pure `$str =~ $re` way, but in a substitution: `$str =~ s/$re//`. Unless I am misunderstanding the docs, this means the regex WILL be recompiled on every substitution without `/o`, even if the code example you gave would not. I updated the Q with the example usage. – DVK Jun 01 '11 at 18:53
  • I do not think so. But it is easy enough to try an experiment... Set `$re = qr{...}` where `...` depends on some string `$s`, then modify `$s`, then do your `s/$re//`. If it uses the original string, then it is a good bet it got compiled. – Nemo Jun 01 '11 at 18:56
  • Worst case, just use `/o`, as in `my $re = qr{...}o`. When you want to recompile, just assign `$re = qr{...}o`. – Nemo Jun 01 '11 at 18:59
  • @DVK: I think you are over-estimating the abilities of Perl. Try this: `perl -e 'my $myVar = "x13"; my $qr = qr{some text $myVar and more}; $myVar = "zz22"; print "$qr\n";'`, which gives me `(?-xism:some text x13 and more)`. The compiled regex has the value of the variable at the time it was compiled - with no 'o' modifiers in sight. – Jonathan Leffler Jun 01 '11 at 19:26
  • @Jonathan - my tests show opposite for `s///`: `perl -e '{for (my $i=0; $i<4; $i++) { my $re = qr{$i}ix; $s="1234"; $s =~ s/$re//; print "i=$i; s=$s\n"; }}'` -- produces `1234; 234; 134; 124`. – DVK Jun 01 '11 at 19:43
  • @DVK: Of course it does -- you're explicitly recompiling the regex each time through the loop. – Michael Carman Jun 01 '11 at 19:56
  • @Michael. Exactly. @DVK: Move the `my $re = qr{...}` out of the loop; that is the whole point. With the regexp object generated by `qr{}`, you can control exactly when it gets compiled or recompiled. If you were using something like `s/abc$re//`, you might have to worry... But for `s/$re//`, I am sure Perl will _not_ recompile the regexp every time. – Nemo Jun 01 '11 at 23:00

According to perlop:

The effect the 'o' modifier has is not propagated, being restricted to those patterns explicitly using it.

So if you write

my $str = 'x';
my $re  = qr/$str/o;
...
if (s/$re//) {
    ...
}

Perl will still check whether $re has changed when executing the s///. The /o acts as a promise that the value of $str used in compiling $re won't change, so if you re-executed the qr// you'd get the same result even if $str had changed. You can see this in effect with use re 'debug':

use strict;
use warnings;
use re 'debug';

foreach my $i (0 .. 2) {
    my $s  = '123';

    print STDERR "Setting \$re\n";
    my $re = qr/$i/o;

    print STDERR "Performing s///\n";
    $s =~ s/$re//; 
}

With the /o modifier, you'll only see "Compiling REx..." after "Setting $re" the first time through the loop. Without it you'll see it each iteration.

The take-away is that if you want to change the pattern during runtime you shouldn't use /o. It won't affect the s/// and it will prevent you from being able to re-compile $re when you need to.
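
The second half of that take-away can be checked without reading debug output. The sketch below uses two hypothetical helpers, `compile_once` and `compile_fresh`, which differ only in the /o flag; as far as I can tell, the /o version keeps handing back the pattern from its first call, which is the same effect the question's own loop demonstrates.

```perl
use strict;
use warnings;

# Hypothetical helpers: identical apart from the /o modifier.
sub compile_once  { my ($src) = @_; return qr/$src/o; }
sub compile_fresh { my ($src) = @_; return qr/$src/;  }

# Call each helper twice, with a different pattern source each time.
my @once  = map { compile_once($_)  } qw(abc def);
my @fresh = map { compile_fresh($_) } qw(abc def);

# Does the object from the second call match the second source string?
my $once_matches_def  = 'def' =~ $once[1]  ? 1 : 0;
my $fresh_matches_def = 'def' =~ $fresh[1] ? 1 : 0;

print "with /o: $once_matches_def, without /o: $fresh_matches_def\n";
```

If the /o line reports no match for 'def', the qr//o at that spot in the code is stuck on its first pattern, which is why it gets in the way of recompiling on demand.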

Michael Carman