Get rid of your $caseSensitive parameter, as it will be useless in many cases. Instead, users of that function can encode the necessary information directly in the $validationRe regex.
When you create a regex object like qr/foo/, the pattern is compiled at that point into instructions for the regex engine. If you stringify a regex object, you get a string that, when interpolated back into a regex, behaves exactly like the original regex object. Most importantly, this means that all flags provided in or omitted from the regex object literal are preserved and can't be overridden! This is by design, so that a regex object continues to behave identically no matter what context it is used in.
That's a bit dry, so let's use an example. Here is a match function that tries to apply a couple of similar regexes to a list of strings. Which ones will match?
use strict;
use warnings;
use feature 'say';
# This sub takes a string to match on, a regex, and a case insensitive marker.
# The regex will be recompiled to anchor at the start and end of the string.
sub match {
    my ($str, $re, $i) = @_;
    return $str =~ /\A$re\z/i if $i;
    return $str =~ /\A$re\z/;
}
my @words = qw/foo FOO foO/;
my $real_regex = qr/foo/;
my $fake_regex = 'foo';
for my $re ($fake_regex, $real_regex) {
    for my $i (0, 1) {
        for my $word (@words) {
            my $match = 0+ match($word, $re, $i);
            my $output = qq("$word" =~ /$re/);
            $output .= "i" if $i;
            say "$output\t-->" . uc($match ? "match" : "fail");
        }
    }
}
Output:
"foo" =~ /foo/ -->MATCH
"FOO" =~ /foo/ -->FAIL
"foO" =~ /foo/ -->FAIL
"foo" =~ /foo/i -->MATCH
"FOO" =~ /foo/i -->MATCH
"foO" =~ /foo/i -->MATCH
"foo" =~ /(?^:foo)/ -->MATCH
"FOO" =~ /(?^:foo)/ -->FAIL
"foO" =~ /(?^:foo)/ -->FAIL
"foo" =~ /(?^:foo)/i -->MATCH
"FOO" =~ /(?^:foo)/i -->FAIL
"foO" =~ /(?^:foo)/i -->FAIL
First, we should notice that the string representation of regex objects has this weird (?^:...) form. In a non-capturing group (?: ... ), modifiers for the pattern inside the group can be added or removed between the question mark and colon, while the ^ indicates the default set of flags.
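To see this form for yourself, here is a minimal sketch that stringifies two regex objects and tries an inline modifier. The variable names are made up for illustration, and the exact stringification assumes Perl 5.14 or newer (older versions spell out the flags differently):

```perl
use strict;
use warnings;

# Two regex objects that differ only in the /i flag.
my $plain  = qr/foo/;
my $nocase = qr/foo/i;

# Interpolating into a string stringifies the regex object.
print "$plain\n";     # (?^:foo)  on Perl 5.14+
print "$nocase\n";    # (?^i:foo) on Perl 5.14+

# A modifier inside the group applies to that group only.
print "inline (?i:...) matches FOO\n" if 'FOO' =~ /(?i:foo)/;

# An outer /i does not reach inside the (?^:...) group.
print "outer /i is ignored\n" unless 'FOO' =~ /\A$plain\z/i;
```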
Now when we look at the fake regex that's actually just a string being interpolated, we can see that the addition of the /i flag makes a difference as expected. But when we use a real regex object, it doesn't change anything: the outside /i cannot override the (?^: ... ) flags.
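Given that, the function can probably drop the flag parameter and leave case sensitivity to the caller. A rough sketch, where match_exact is a hypothetical name and not part of the original code, and the extra (?: ... ) wrapper is my own addition so that plain strings containing alternation still get anchored as a whole:

```perl
use strict;
use warnings;

# Hypothetical replacement for match(): no case-sensitivity parameter.
# The caller picks the flags when building the regex object.
sub match_exact {
    my ($str, $re) = @_;
    return $str =~ /\A(?:$re)\z/ ? 1 : 0;
}

for my $re (qr/foo/, qr/foo/i) {
    for my $word (qw/foo FOO foO/) {
        my $result = match_exact($word, $re) ? 'MATCH' : 'FAIL';
        print qq("$word" =~ $re\t-->$result\n);
    }
}
```

With qr/foo/ only "foo" matches, while qr/foo/i matches all three words.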
It is probably best to assume that all regexes already are regex objects and should not be interfered with. If you load the regex patterns from a file, you should require the regexes to use the (?: ... ) syntax to apply flags (e.g. (?^i:foo) as an equivalent to qr/foo/i). For example, loading one regex per line from a file handle could look like:
my @regexes;
while (<$fh>) {
    chomp;
    push @regexes, qr/$_/; # will die here on regex syntax errors
}
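If dying on the first bad pattern is too harsh, one option is to trap the error per line. This is only a sketch: the in-memory file handle, the sample patterns, and the @rejected list are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical input: one pattern per line, the last one is malformed.
my $input = "(?i:foo)\nba[rz]\n(unclosed\n";
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my (@regexes, @rejected);
while (my $line = <$fh>) {
    chomp $line;
    # qr// dies on a syntax error, so wrap the compilation in eval.
    my $re = eval { qr/$line/ };
    if (defined $re) { push @regexes,  $re }
    else             { push @rejected, $line }
}
close $fh;

printf "%d compiled, %d rejected\n", scalar @regexes, scalar @rejected;
```

Here the flags travel with each pattern, so (?i:foo) stays case insensitive wherever the compiled regex is used later.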