Why are the two printed numbers different?

#!/usr/bin/env perl
use warnings;
use 5.10.1;

my $sep = '';
my $number = 110110110110111;

$number =~ s/(\d)(?=(?:\d{3})+\b)/$1$sep/g;
say "A: <$number>";

$number =~ s/\Q$sep\E//g;
say "B: <$number>";

Output:

A: <110110110110111>
B: <11111111111>
sid_com

1 Answer

Quote from man perlop:

If the pattern evaluates to the empty string, the last successfully executed regular expression is used instead.

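Here `$sep` is empty, so `\Q$sep\E` interpolates to nothing and the second substitution is effectively `s///g`. It therefore reuses the pattern from the first substitution and deletes every digit that is followed by a multiple of three digits, which in this number is each `0`. To see the effect in isolation, insert a different successful regex match before the second substitution: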

(my $foo = '1') =~ s/1/x/; # successfully match “1”
$number =~ s///g;          # now you’re deleting all 1s
say "B: <$number>";        # <0000>

I’d say this should be deprecated and warned about by use warnings. It’s hard to see the benefits.
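
One way to sidestep the behaviour is to skip the substitution entirely when the separator is empty. The following is only a minimal sketch under that assumption; `strip_separator` is a hypothetical helper name that does not appear in the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.10.1;

# Hypothetical helper: remove every occurrence of $sep from $text.
# With an empty $sep there is nothing to remove, so return early instead of
# letting s/\Q$sep\E//g collapse into s///g and reuse an earlier pattern.
sub strip_separator {
    my ($text, $sep) = @_;
    return $text unless length $sep;
    $text =~ s/\Q$sep\E//g;
    return $text;
}

say 'empty: <', strip_separator('110110110110111', ''), '>';
say 'comma: <', strip_separator('110,110,110,110,111', ','), '>';
```

Both calls should print `110110110110111` between the angle brackets, whatever pattern happened to match earlier in the program.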

Peter Mortensen
zoul
  • Note that this shows that \Q\E and interpolation are string operations, not part of the regex proper, since they are all resolved before the "pattern evaluates to the empty string" test. If you really want to prevent this misfeature and your regex consists of just interpolated bits that may all be empty, throw in a `(?#)` which has no effect on matching but makes the pattern non-empty. – ysth Oct 29 '12 at 10:57
  • Re *"I’d say this should be deprecated"*: Indeed. I ran into this and wasted a whole day. There was a stray `s///g;` (template used during editing) and the result was some seemingly bizarre behaviour, like a test-only (read-only) construct, reduced to the dummy construct `if (/\-\d\d/) {}`, affecting the output. There were about 700 lines between the two "interacting" lines. It was tracked down by reducing the input data size and then the script size (removing lines that didn't affect the bug). I only found this Stack Overflow question after the fact. (Perl v5.30.0 on Linux (Debian deriv.)) – Peter Mortensen Apr 20 '21 at 23:31
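
The `(?#)` suggestion from the first comment can be sketched as follows. This reuses the script from the question and changes only the second substitution; `$number` is quoted as a string here purely to keep the illustration independent of numeric formatting.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.10.1;

my $sep    = '';
my $number = '110110110110111';

# Same first substitution as in the question. It succeeds, so it becomes
# the "last successful" pattern that an empty pattern would reuse.
$number =~ s/(\d)(?=(?:\d{3})+\b)/$1$sep/g;
say "A: <$number>";

# (?#) is an empty regex comment. It does not change what is matched, but
# the pattern string is now "(?#)" rather than "", so the empty-pattern
# rule quoted from perlop no longer applies.
$number =~ s/(?#)\Q$sep\E//g;
say "B: <$number>";
```

With this change both lines should show the same number, because the second substitution only matches empty strings and replaces them with nothing.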