1

I would like to tell the difference between the number 1 and the string '1'.

I want to do this because I want to determine the number of capturing parentheses in a regular expression after a successful match. According to the perlop doc, a list (1) is returned when there are no capturing groups in the pattern. So if I get a successful match and a list (1), I cannot tell whether the pattern has no parens or has one paren that matched a '1'. I can resolve that ambiguity if there is a difference between the number 1 and the string '1'.

Brian Tompsett - 汤莱恩
dividebyzero
  • Please clarify exactly what it is you're trying to do, and what the problem is. As it is, your question doesn't make a lot of sense. Perhaps a code example ... – Brian Roach Nov 11 '11 at 19:13
  • Your question doesn't make sense... The only way there could be an ambiguity is if both the regex used and the value matched against are unknown, which strikes me as a bit odd... – pavel Nov 11 '11 at 19:16
  • I am trying to count the number of capturing groups in a regular expression containing alternatives. See my other question [link](http://stackoverflow.com/questions/8069006/how-can-i-tell-which-of-the-alternatives-matched-in-a-perl-regular-expression-pa). With something like qr/($re1)|($re2)|($re3)/, I will be able to tell which re matched if I know the number of capturing groups in each regexp. But as I found out, there is no easy way to do this (Perl doesn't expose its compiled regexp to programmers). So my idea is that once a sub-regexp has matched, I can use the number of captures to tell. – dividebyzero Nov 11 '11 at 19:31
  • Check http://stackoverflow.com/questions/12647/how-do-i-tell-if-a-variable-has-a-numeric-value-in-perl – Matteo Nov 11 '11 at 19:32
  • @dividebyzero: ever tried `print`ing the result of `qr//`? – pavel Nov 11 '11 at 19:39
  • @Matteo: Scalar::Util::looks_like_number() will return a true value for both `1` and `'1'` because they both *look* like numbers. (Curiously, it seems to return a *different* true value, but I can't find any documentation about what those values mean.) – Michael Carman Nov 11 '11 at 19:47
  • You're probably better off just using a loop and searching for `$re1`, `$re2`, etc. separately. Then there won't be any confusion about which regex matched. – cjm Nov 11 '11 at 19:50
  • @MichaelCarman: you are right. In fact '1' can be used as number. '1'+1 is valid and will return 2. – Matteo Nov 11 '11 at 20:10

6 Answers

7

You can tell how many capturing groups are in the last successful match by using the special @+ array. $#+ is the number of capturing groups. If that's 0, then there were no capturing parentheses.
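As a minimal sketch of the `$#+` approach, the helper below (the name `group_count` is hypothetical, not part of Perl) runs a match and reports how many capture groups the pattern contains. It assumes the pattern matches; on failure it returns nothing.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number of capture groups in $re,
# or an empty list / undef if $re does not match $text.
sub group_count {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    my $groups = $#+;    # last index of @+ is the group count of the pattern
    return $groups;
}

print group_count('1', qr/1/),   "\n";    # 0: no capturing parentheses
print group_count('1', qr/(1)/), "\n";    # 1: one capturing group
```

Because `@+` describes the pattern of the last successful match, it does not depend on whether the captured text happens to be '1'.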

cjm
2

For example, bitwise operators behave differently for strings and integers:

~1 = 18446744073709551614

~'1' = Î ('1' = 0x31, ~'1' = ~0x31 = 0xce = 'Î')

#!/usr/bin/perl

($b) = ('1' =~ /(1)/);
print isstring($b) ? "string\n" : "int\n";
($b) = ('1' =~ /1/);
print isstring($b) ? "string\n" : "int\n";

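sub isstring {
    # String operands yield "\0" (true); numeric operands yield 0 (false).
    # No () prototype: the sub takes one argument.
    return ($_[0] & ~$_[0]);
}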

isstring returns either 0 (the result of a numeric bitwise op), which is false, or "\0" (the result of a bitwise string op; see perldoc perlop), which is true because it is a non-empty string other than "0".

  • Although this will distinguish between 1 and '1', it's not how you should try to count capture groups in a regex. – brian d foy Nov 11 '11 at 21:15
  • Hey Brian :) Yes, I was just about to add a note that the number of capture groups is known even before running a regexp. –  Nov 11 '11 at 21:20
1

If you want to know the number of capture groups a regex matched, just count them. Don't look at the values they return, which appears to be your problem:

You can get the count by looking at the result of the list assignment, which returns the number of items on the right hand side of the list assignment:

my $count = my @array = $string =~ m/.../g;

If you don't need to keep the capture buffers, assign to an empty list:

my $count = () = $string =~ m/.../g;

Or do it in two steps:

my @array = $string =~ m/.../g;
my $count = @array;
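A small worked example of the counting idiom, using a made-up input string and patterns (both are assumptions for illustration only):

```perl
use strict;
use warnings;

my $string = 'a1b22c333';    # hypothetical sample input

# With /g in list context, the match returns every captured substring,
# and the list assignment in scalar context yields how many there were.
my $count = () = $string =~ m/(\d+)/g;
print "$count\n";    # 3

# Without capture groups and without /g, a successful match returns the
# list (1), so the count is 1 even though the pattern has no groups.
my $plain = () = $string =~ m/\d+/;
print "$plain\n";    # 1
```

The second case is the ambiguity the question describes: the count of returned items alone does not separate "no groups" from "one group".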

You can also use the @+ or @- variables, using some of the tricks I show in the first pages of Mastering Perl. These arrays hold the starting (@-) and ending (@+) offsets of each capture buffer. The values at index 0 apply to the entire pattern, the values at index 1 are for $1, and so on. The last index of @+ ($#+), then, is the total number of capture buffers. See perlvar.
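To make the offset arrays concrete, here is a rough sketch that walks `@-` and `@+` after a match. The helper name `group_spans`, the sample text, and the pattern are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [start, end, text] for the whole match
# (index 0) and for every capture group, or an empty list on failure.
sub group_spans {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    my @spans;
    for my $i (0 .. $#+) {
        # A group that did not take part in the match has undefined offsets.
        my ($start, $end) = ($-[$i], $+[$i]);
        my $piece = defined $start ? substr($text, $start, $end - $start) : undef;
        push @spans, [$start, $end, $piece];
    }
    return @spans;
}

my @spans = group_spans('foo=42', qr/(\w+)=(\d+)/);
printf "%d capture groups\n", $#spans;
for my $i (0 .. $#spans) {
    printf "group %d: offsets %d to %d, text '%s'\n", $i, @{ $spans[$i] };
}
```

The loop bound is `$#+` rather than `$#-`, since `$#-` only reaches the last group that actually matched.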

brian d foy
0

Perl converts between strings and numbers automatically as needed. Internally, it tracks the values separately. You can use Devel::Peek to see this in action:

use Devel::Peek;
$x = 1;
$y = '1';
Dump($x);
Dump($y);

The output is:

SV = IV(0x3073f40) at 0x3073f44
  REFCNT = 1
  FLAGS = (IOK,pIOK)
  IV = 1
SV = PV(0x30698cc) at 0x3073484
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x3079bb4 "1"\0
  CUR = 1
  LEN = 4

Note that the dump of $x has a value for the IV slot, while the dump of $y doesn't but does have a value in the PV slot. Also note that simply using the values in a different context can trigger stringification or numification and populate the other slots. For example, if you did $x . '' or $y + 0 before peeking at the value, you'd get this:

SV = PVIV(0x2b30b74) at 0x3073f44
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 1
  PV = 0x3079c5c "1"\0
  CUR = 1
  LEN = 4

At which point 1 and '1' are no longer distinguishable at all.
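The same flags can be inspected programmatically. The sketch below swaps Devel::Peek for the core B module, which is a different tool from the one used above; the helper name `value_kind` is hypothetical, and the flag behaviour is an internal detail that should not be relied on for program logic.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: reports which public value flags are set on a scalar.
sub value_kind {
    my $flags  = B::svref_2object(\$_[0])->FLAGS;
    my $is_num = $flags & (B::SVf_IOK() | B::SVf_NOK());
    my $is_str = $flags & B::SVf_POK();
    return $is_num && $is_str ? 'both'
         : $is_num            ? 'number'
         : $is_str            ? 'string'
         :                      'other';
}

my $x = 1;
my $y = '1';
print value_kind($x), "\n";    # number
print value_kind($y), "\n";    # string

my $sum = $y + 0;              # numeric use caches an integer value in $y
print value_kind($y), "\n";    # both
```

The last line shows why this is fragile: merely using `$y` as a number changes what the check reports.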

Michael Carman
0

Check for the definedness of $1 after a successful match. The logic goes like this:

  • If the list is empty then the pattern match failed
  • Else if $1 is defined then the list contains all the captured substrings
  • Else the match was successful, but there were no captures
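The three cases above can be sketched as follows. The helper name `classify_match` and its return strings are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the three cases listed above.
sub classify_match {
    my ($text, $re) = @_;
    my @list = $text =~ $re;
    return 'no match' unless @list;
    return 'captures: ' . scalar(@list) if defined $1;
    return 'matched, no captures';
}

print classify_match('2', qr/1/),   "\n";    # no match
print classify_match('1', qr/(1)/), "\n";    # captures: 1
print classify_match('1', qr/1/),   "\n";    # matched, no captures
```

One caveat, as far as I can tell: if the first group is optional and does not take part in the match, as in /foo(1)?/ against 'foo', $1 is undefined and this sketch would report no captures.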
Borodin
-1

Your question doesn't make a lot of sense, but it appears you want to know the difference between:

$a = "foo"; 
@f = $a =~ /foo/; 

and

$a = "foo1"; 
@f = $a =~ /foo(1)?/; 

Both return the same thing, regardless of whether a capture was made.

The answer is: Don't try to use the returned array. Check to see if $1 is not equal to "".

Brian Roach