I have recently been learning about the /x modifier in Perl Best Practices, which lets you do cool things like multi-line indentation and inline documentation:

$txt =~ m/^                     # anchor at beginning of line
      The\ quick\ (\w+)\ fox    # fox adjective
      \ (\w+)\ over             # fox action verb
      \ the\ (\w+)\ dog         # dog adjective
      (?:                       # whitespace-trimmed comment:
        \s* \# \s*              #   whitespace and comment token
        (.*?)                   #   captured comment text; non-greedy!
        \s*                     #   any trailing whitespace
      )?                        # this is all optional
      $                         # end of line anchor
     /x;                        # allow whitespace
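
For reference, a self-contained sketch of how a pattern like this behaves when run. The sample line in `$txt` is a hypothetical input, and every literal space is escaped because /x ignores unescaped whitespace in the pattern:

```perl
use strict;
use warnings;

# Hypothetical sample line; any text in the same shape would do.
my $txt = 'The quick brown fox jumps over the lazy dog  # hello world  ';

my @captures = $txt =~ m/^
      The\ quick\ (\w+)\ fox    # fox adjective
      \ (\w+)\ over             # fox action verb
      \ the\ (\w+)\ dog         # dog adjective
      (?:                       # optional trailing comment:
        \s* \# \s*              #   whitespace and comment token
        (.*?)                   #   captured comment text; non-greedy
        \s*                     #   any trailing whitespace
      )?
      $                         # end of line anchor
     /x;

print join('|', @captures), "\n";   # prints: brown|jumps|lazy|hello world
```

Note that none of the comments contain a slash, since that would end the pattern early.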

However, I was unable to do the equivalent for find/replace string substitutions. Is there a similar best practice for managing complex substitutions more effectively?

Edit: Take this as an example:

$test =~ s/(src\s*=\s*['"]?)(.*?\.(jpg|gif|png))/${1}something$2/sig;

Is there a similar way that this could be documented using multi-line/whitespace for better readability?

Many thanks

user1027562

4 Answers

Since you've chosen not to provide an example of something that doesn't work, I'll offer a few guesses at what you might be doing wrong:

  • Note that the delimiter (in your case /) cannot appear inside any comment within the regex, because it would be taken as the end of the regex. For example, this:

    s/foo # this is interesting and/or cool
     /bar/x
    

    will not work, because the regex is terminated by the slash between "and" and "or".

  • Note that /x does not work on the replacement-string, only on the regex itself. For example this:

    s/foo/bar # I love the word bar/x
    

    will replace foo with bar # I love the word bar.

    If you really want to be able to put comments in the replacement-string, then I suppose you could use a replacement-expression instead, using the /e flag. That would let you use the full syntax of Perl. For example:

    s/foo/'bar' # I love the word bar/e
    

Here is an example that does work:

$test =~
  s/
    # the regex to replace:
    (src\s*=\s*['"]?)      # src=' or src=" (plus optional whitespace)
    (.*?\.(jpg|gif|png))   # the URI of the JPEG or GIF or PNG image
  /
    # the string to replace it with:
    $1 .                   # src=' or src=" (unchanged)
    'something' .          # insert 'something' at the start of the URI
    $2                     # the original URI
  /sigxe;
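
The difference between the pattern side and the replacement side can be checked with a small sketch. The variable names and sample strings here are made up for illustration:

```perl
use strict;
use warnings;

# /x affects only the pattern: the replacement is still a plain string,
# so the would-be comment below ends up in the result.
my $plain = 'foo';
$plain =~ s/foo # match foo
           /bar # not a comment/x;

# With /e the replacement is Perl code, so a real comment is allowed.
my $evaluated = 'foo';
$evaluated =~ s/foo # match foo
               /'bar' # now this really is a comment
               /xe;

print "[$plain]\n";      # prints: [bar # not a comment]
print "[$evaluated]\n";  # prints: [bar]
```
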
ruakh
  • This is a good answer so far. But you could point out that the choice of delimiter is arbitrary: `s{foo}` and on a new line `{bar}` etc. This sometimes avoids the `/e` switch. – amon Feb 18 '13 at 22:46
  • @amon: I don't follow. How would the choice of a different delimiter affect the use of `/e`? – ruakh Feb 19 '13 at 00:25
  • It allows more layout options than the slash delimiter, as the substitution may be separated from the pattern by whitespace, including newlines. This would solve your first example, but wouldn't make much difference with your last. Except that curlies may be preferable anyway when enclosing code. – amon Feb 19 '13 at 00:33
  • @amon: O.K., yes, I see what you mean. – ruakh Feb 19 '13 at 00:38
  • Thank you both. This answers my question. – user1027562 Feb 19 '13 at 14:19

If we just add the /x, we can break up the regular expression portion easily, including allowing comments.

my $test = '<img src = "http://www.somewhere.com/im/alright/jack/keep/your/hands/off/of/my/stack.gif" />';

$test =~ s/
    ( src \s* = \s* ['"]? ) # a src attribute ...
    ( .*? 
      \. (jpg|gif|png)      # to an image file type, either jpeg, gif or png
    )
    /$1something$2/sigx     # put 'something' in front of it
    ;

You have to use the evaluation switch (/e) if you want to break up the replacement. But the multi-line layout for the match portion works fine.

Notice that I did not have to separate $1, because $1something is not a valid identifier anyway, so my version of Perl, at least, does not get confused.
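
A quick way to convince yourself of this, as a sketch with a hypothetical input string: numbered capture variables consist of digits only, so `$1something` and `${1}something` interpolate identically. A named variable would be different; there the braces would be required.

```perl
use strict;
use warnings;

my $test = '<img src="pic.gif" />';   # hypothetical input

# Same substitution twice: once with a bare $1, once with ${1}.
(my $bare   = $test) =~ s/(src=")(.*?\.gif)/$1something$2/;
(my $braced = $test) =~ s/(src=")(.*?\.gif)/${1}something$2/;

print "$bare\n";     # prints: <img src="somethingpic.gif" />
print "$braced\n";   # prints the same line
```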

For most of my evaluated replacements, I prefer the bracket style of substitution delimiter:

$test =~ s{
      ( src \s* = \s* ['"]? ) # a src attribute ... '
      ( .*? 
        \. (jpg|gif|png)      # to an image file type, either jpeg, gif or png
      )
    }{
        $1 . 'something' . $2
    }sigxe 
    ;

just to make it look more code-like.
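
The bracket style has a practical side effect too. As a sketch (the input string is a made-up example): with braces as delimiters, a slash inside a pattern comment no longer ends the regex, and the replacement can sit on its own line even without /e.

```perl
use strict;
use warnings;

my $test = '<img src = "http://www.somewhere.com/stack.gif" />';  # hypothetical input

$test =~ s{
      ( src \s* = \s* ["']? )     # attribute name and/or opening quote
      ( .*? \. (?:jpg|gif|png) )  # image URI, e.g. http://host/a.png
    }
    {$1something$2}xsi;

print "$test\n";   # prints: <img src = "somethinghttp://www.somewhere.com/stack.gif" />
```

Without /e the replacement is still a literal string, so any whitespace placed inside the second pair of braces would end up in the output.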

Axeman

Well

$test =~ s/(src\s*=\s*['"]?)    # first group
        (.*?\.(jpg|gif|png))        # second group
        /${1}something$2/sigx;

should work, and indeed does. Of course, you can't do the same in the replacement part unless you use something like:

$test =~ s/(src\s*=\s*['"]?)    # first group
        (.*?\.(jpg|gif|png))        # second group
        /
        $1              # Get 1st group
        . "something"   # Append ...
        . $2            # Get 2d group
        /sigxe;
Orabîg
s/foo/bar/

could be written as

s/
   foo     # foo
/
   "bar"   # bar
/xe
  • /x to allow whitespace in the pattern
  • /e to allow code in the replacement expression
ikegami