0

I find the following statement in a perl (actually PDL) program:

/\/([\w]+)$/i;

Can someone decode this for me, an apprentice in perl programming?

Himanshu
BoomerTN

6 Answers

11

Sure, I'll explain it from the inside out:

\w - matches a single character that can be used in a word (alphanumeric, plus '_')

[...] - matches a single character from within the brackets

[\w] - matches a single character that can be used in a word (kinda redundant here)

+ - matches the previous item (here, the bracketed character class) one or more times, as many times as possible.

[\w]+ - matches one or more consecutive word characters. This will find a word.

(...) - grouping. remember this set of characters for later.

([\w]+) - match a word, and remember it for later

$ - end-of-line anchor. matches at the end of the string (or just before a final newline)

([\w]+)$ - match the last word on a line, and remember it for later

\/ - a single slash character '/'. it must be escaped by backslash, because slash is special.

\/([\w]+)$ - match the last word on a line, after a slash '/', and remember the word for later. This is probably grabbing the directory/file name from a path.

/.../ - match syntax

/.../i - i means case-insensitive.

All together now:

/\/([\w]+)$/i; - match the last word on a line and remember it for later; the word must come directly after a slash. Basically, grab the last component of a path, provided that component consists only of word characters. The case-insensitive part is irrelevant, since \w already matches both cases.
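
To make the breakdown above concrete, here is a minimal sketch of the pattern applied to a few strings. The helper name last_word_after_slash and the sample paths are made up for illustration; they are not from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original program): returns the word
# captured by the pattern, or undef when the string does not match.
sub last_word_after_slash {
    my ($path) = @_;
    return $path =~ /\/([\w]+)$/i ? $1 : undef;
}

# Sample inputs, chosen only for illustration.
for my $path ('/usr/local/bin', 'foo/bar2', 'foobar', '/tmp/file.txt') {
    my $word = last_word_after_slash($path);
    printf "%-16s => %s\n", $path, defined $word ? $word : '(no match)';
}
```

The last sample shows a limit of the pattern: the '.' in file.txt is not a word character, so nothing between the final slash and the end of the string can satisfy [\w]+$ and the match fails.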

More details about Perl regex here: http://www.troubleshooters.com/codecorn/littperl/perlreg.htm

And as JRFerguson pointed out, YAPE::Regex::Explain is useful for tokenizing regex, and explaining the pieces.

Tim
  • Thank you Tim...excellent description and fits perfectly with the context of the code. Looks like I need more regex experience after 40+ years of programming! – BoomerTN Sep 10 '12 at 18:37
5

You will find the YAPE::Regex::Explain module worth installing.

#!/usr/bin/env perl
use YAPE::Regex::Explain;
#...may need to single quote $ARGV[0] for the shell...
print YAPE::Regex::Explain->new( $ARGV[0] )->explain;

Assuming this script is named 'rexplain' do:

$ ./rexplain '/\/([\w]+)$/i'

...to obtain:

The regular expression:

(?-imsx:/\/([\w]+)$/i)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  /                        '/'
----------------------------------------------------------------------
  \/                       '/'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [\w]+                    any character of: word characters (a-z,
                             A-Z, 0-9, _) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
  /i                       '/i'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

UPDATE:

See also: https://stackoverflow.com/a/12359682/1015385 . As noted there and in the module's documentation:

There is no support for regular expression syntax added after Perl version 5.6, particularly any constructs added in 5.10.

JRFerguson
2
/\/([\w]+)$/i;

It is a regex, and if it is a complete statement, it is applied to the $_ variable, like so:

$_ =~ /\/([\w]+)$/i;

It looks for a slash \/, followed by an alphanumeric string \w+, followed by end of line $. It also captures () the alphanumeric string, which ends up in the variable $1. The /i on the end makes it case-insensitive, which has no effect in this case.
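
As a rough sketch of the implicit $_ behaviour described here (the input strings are invented for the example), a bare pattern inside a for loop is matched against each element in turn, and $1 is read only after a successful match:

```perl
use strict;
use warnings;

my @captured;

# "for" aliases each list element to $_ in turn.
for ('/home/user/Data_01', 'no_slash_here', 'lib/PDL') {
    # A bare pattern is matched against $_; this is the same as
    # writing: $_ =~ /\/([\w]+)$/i
    if (/\/([\w]+)$/i) {
        push @captured, $1;   # $1 holds what the parentheses captured
    }
}

print join(', ', @captured), "\n";
```

This should print Data_01, PDL. Checking the result of the match before reading $1 matters, because a failed match does not reset $1.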

TLP
2

While it doesn't help "explain" a regex, once you have a test case, Damian's new Regexp::Debugger is a cool utility to watch what actually occurs during the matching. Install it and then do rxrx at the command line to start the debugger, then type in /\/([\w]+)$/ and '/r' (for example), and finally m to start the matching. You can then step through the debugger by hitting enter repeatedly. Really cool!

Joel Berger
0

This matches $_ against a slash followed by one or more word characters at the end of the string (case-insensitively), and stores those characters in $1.

$_ value     then     $1 value 
------------------------------
"/abcdes"     |       "abcdes"
"foo/bar2"    |       "bar2"
"foobar"      |       undef      # no slash so doesn't match
vol7ron
0

The Online Regex Analyzer deserves a mention. Its explanation of your regex is pasted here for the record.

Sequence: match all of the followings in order

/                                                  (slash)
                                               --+
Repeat                                           | (in GroupNumber:1)
   AnyCharIn[ WordCharacter] one or more times   |
                                               --+
EndOfLine
Chui Tey