I have no idea how this is happening, but my Perl regular expression matches do not seem to update after a new match. Instead of updating the $& and $1 variables with each match, they stay stuck on the first one.

I've looked everywhere and find this extremely frustrating.

See the debugger output below. As you can see, the first match makes sense, but the second one doesn't.

Thanks

  DB<79>  $riz =~ m{url=(.*?)Support};

  DB<80> p$&;
url="http://www.svartapelsin.se" draggingName="Bunny Camp Support
  DB<81> $riz =~ m{href=(.*artist?)};

  DB<82> p $&;
url="http://www.svartapelsin.se" draggingName="Bunny Camp Support
  DB<83>

Update: Here's another sample showing that the text "artist" is in the string, but the match still does not find it. The value of $riz is a huge chunk of HTML, so it is hard to post.

  DB<103> $riz =~ m{url=(.*?)Support};

  DB<104> p $&;
url="http://www.svartapelsin.se" draggingName="Bunny Camp Support
  DB<105> $riz =~ m{artist};

  DB<106> p $&;
url="http://www.svartapelsin.se" draggingName="Bunny Camp Support
  DB<107> p  string.index($riz,"artist");
string105
  DB<108>

My $riz is all the HTML returned by this link, http://itunes.apple.com/us/app/id385972277, when you use the user agent iTunes/10.2 (Macintosh; U; PPC Mac OS X 10.2).


Here's another example with the same $riz

  DB<128>  $riz =~ m/.*/;

  DB<129> p $&;
url="http://www.svartapelsin.se" draggingName="Bunny Camp Support
  DB<130>
...
  DB<136> p substr $riz,0,20;
<?xml version="1.0"
  DB<137>

I mean, isn't this just ridiculous? It should have just output the value of $riz, no? As you can see, that is different from what is shown. Also, how could m/.*/ not be a valid regex?

Jorge Guzman
  • If $riz does not match, then `$&` is not modified. – William Pursell Jun 13 '12 at 13:31
  • added some sample code.. – Jorge Guzman Jun 13 '12 at 13:47
  • The lesson to learn here is that it is *less error-prone* in the majority of cases to treat the match operation like a function and *simply use its return values*, and not to rely on the weird side-effect variables `$1`, `$&` and friends. Type `x $riz =~ m{url=(.*?)Support}` in the debugger. – daxim Jun 13 '12 at 13:52
  • What is $riz? You still haven't posted that. Also, you seem to think you are writing JavaScript. "string.index" doesn't do what you seem to think. – Ben Jun 13 '12 at 14:04
  • added a reference to the right $riz and another example of what seems weird to me – Jorge Guzman Jun 13 '12 at 14:26

3 Answers

$& is updated whenever there is a successful match. If the match does not succeed, then $& is not updated and retains its previous value. See the $MATCH variable in perlvar. (perldoc perlvar and search for $MATCH)
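A minimal sketch of this behaviour outside the debugger. The string below is only a hypothetical stand-in for the asker's `$riz` (the real value is a large HTML document), and `@seen` is a name made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the question's $riz.
my $riz = 'url="http://www.svartapelsin.se" draggingName="Bunny Camp Support';

my @seen;

$riz =~ m{url=(.*?)Support};    # succeeds, so $& is set
push @seen, $&;

$riz =~ m{href=(.*artist?)};    # fails: the sample contains no "href="
push @seen, $&;                 # $& still holds the text of the first match

print "$_\n" for @seen;
print "same value both times\n" if $seen[0] eq $seen[1];
```

Both entries are the same string, which is what the question's transcript looks like when the second pattern simply does not match.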

William Pursell

This is fine. $& contains the string that the last successful regex matched. I assume the contents of $riz don't contain a match for /href=(.*artist?)/. You should check the return value of the regex match.

Are you aware that /artist?/ will match only artist or artis?
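As a small illustration of both points, the sketch below checks the return value of each match and shows that the `?` applies only to the final `t`. The sample strings and the `@results` array are invented for the example.

```perl
use strict;
use warnings;

my @results;
for my $text ('the artist page', 'the artis page', 'the art page') {
    # Only trust $& when the match operator itself returned true.
    if ($text =~ /artist?/) {
        push @results, "'$text' matched '$&'";
    }
    else {
        push @results, "'$text' did not match";
    }
}

print "$_\n" for @results;
```

If the intent was an optional whole word, a group such as `(?:artist)?` would be needed instead.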

Borodin

perldebug says this

Any command not recognized by the debugger is directly executed (eval'd) as Perl code in the current package.

Note that the said eval is bound by an implicit scope. As a result any newly introduced lexical variable or any modified capture buffer content is lost after the eval. The debugger is a nice environment to learn Perl, but if you interactively experiment using material which should be in the same scope, stuff it in one line.

So the $& and $1 etc. variables are localised during execution of a debugger command, and are lost once the command completes.
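The same scoping can be seen in an ordinary script. This sketch uses a string `eval` as a rough stand-in for what the debugger does with each command; it is an approximation, not the debugger's actual code, and the variable names are made up.

```perl
use strict;
use warnings;

my $text = 'alpha beta';

# An earlier successful match in the outer scope.
'seed' =~ /seed/;

# Match inside a string eval and return $& from within that scope.
my $inside = eval q{ $text =~ /beta/; $& };

# Back outside, the match variables are those of the outer scope again.
my $outside = $&;

print "inside the eval:  $inside\n";
print "outside the eval: $outside\n";
```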

You could use

$riz =~ m{url=(.*?)Support}; print $&, "\n"; print $1, "\n";

or

$riz =~ m{url=(.*?)Support}; ($and, $one) = ($&, $1);
p $and
p $one

but without something to preserve those values in the same command line they are forever lost once the regex comparison completes.
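Another option, along the lines of the comment about using the match's return values, is to assign the captures directly in the same statement. The sketch below assumes a hypothetical stand-in for `$riz`; `$url_part` and `$href_part` are illustrative names. In the debugger the `my` would be dropped, since lexical variables declared in one command are also lost afterwards.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the question's $riz.
my $riz = 'url="http://www.svartapelsin.se" draggingName="Bunny Camp Support';

# In list context a match returns its capture groups,
# or an empty list when it fails.
my ($url_part)  = $riz =~ m{url=(.*?)Support};
my ($href_part) = $riz =~ m{href=(.*artist?)};

print defined $url_part  ? "url matched: $url_part\n"   : "url: no match\n";
print defined $href_part ? "href matched: $href_part\n" : "href: no match\n";
```

A failed match leaves the variable undefined, so a stale value from an earlier match cannot be mistaken for a new result.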

Borodin
  • Just as an update, this was 100% the issue that made everything not work. Yes, my regular expression could have had mistakes, but those would have looked different. Once I wrote the debugger commands correctly, everything else was super easy. +20 points to Borodin – Jorge Guzman Jun 14 '12 at 17:23
  • Highly related, from just days ago: [Why doesn't eval '/(…)/' set $1?](http://stackoverflow.com/questions/10909127/perl-why-doesnt-eval-set-1) – daxim Jun 15 '12 at 19:41