4

I just came across this code for doing tab expansion in Perl:

1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;

I tested it and it works, but I am too much of a rookie to understand it. Could anyone explain why it works? Any pointers to related material that could help me understand it would also be appreciated. Thanks a lot.

user685275

2 Answers

8

With the /e modifier, Perl evaluates the replacement part of a substitution as code instead of treating it as a plain string.

$& is the string matched by the last pattern match—in this case, some number of tab characters.

$` is the string preceding whatever was matched by the last pattern match—this lets you know how long the previous text was, so you can align things to columns properly.

For example, running this against the string "Something\t\t\tsomething else", $& is "\t\t\t", and $` is "Something". length($&) is 3, so at most 3 * 8 = 24 spaces are needed. length($`) is 9, so length($`)%8 is 1: the text already extends one column past the last tab stop. Subtracting that gives 24 - 1 = 23 spaces, which puts the following text at column 32, a multiple of eight.
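The arithmetic in that example can be checked with a short script. This is only a sketch: the variable names $before, $tabs, $spaces and $column are made up for illustration and are not part of the original one-liner.

```perl
use strict;
use warnings;

my $string = "Something\t\t\tsomething else";

# Match once without substituting, just to inspect the special variables.
$string =~ /\t+/ or die "no tabs found";
my $before = length($`);                 # length of the text before the tabs
my $tabs   = length($&);                 # number of tabs in the matched run
my $spaces = $tabs * 8 - $before % 8;    # same arithmetic as the one-liner
print "before=$before tabs=$tabs spaces=$spaces\n";   # before=9 tabs=3 spaces=23

# Now run the actual one-liner and see where the following text lands.
1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
my $column = index($string, "something else");
print "text resumes at column $column\n";             # column 32
```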

Jon Purdy
rmmh
  • It should be length($`)%8 instead of length($``)%8. Besides this, your answer is correct. – CRM Apr 30 '11 at 22:55
  • @mmh I cannot edit your post, so try to use this length($`)%8 – CRM Apr 30 '11 at 23:07
  • Nobody here has yet explained how come it’s `1 while`. – tchrist May 01 '11 at 00:14
  • Thanks for answering, as tchrist mentioned, can anyone explain the `1 while`? – user685275 May 01 '11 at 16:55
  • w.r.t. the `1 while`, it's just a "no-op" so that the predicate form of the while loop can be used. It would be equivalent to write: while ($string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {} – Dave Goodell Nov 21 '11 at 17:58
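To make the `1 while` point concrete, here is the same loop written out with a counter in the body. It is a sketch, and $passes is a hypothetical name added only to show how often the substitution succeeds. As far as I can tell, the reason for looping instead of using /g is that each pass expands only the first run of tabs, so on the next pass $` already contains the inserted spaces and its length is the true column; under /g, $` would refer to the original string, where each earlier tab still counts as a single character.

```perl
use strict;
use warnings;

my $string = "a\tbc\t\td";
my $passes = 0;

# Equivalent to: 1 while $string =~ s/.../.../e;
# The substitution returns true while it still finds a run of tabs.
while ($string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {
    $passes++;    # one pass per run of tabs expanded
}

print "$passes passes, final length ", length($string), "\n";   # 2 passes, final length 25
```

The first pass turns the tab after "a" into 7 spaces; the second sees 10 characters before the double tab and inserts 16 - 2 = 14 spaces, so "d" lands at column 24.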
2

The e flag on the substitution means the replacement part (' ' x (...etc...)) is treated as Perl code and evaluated for each match. So, basically, it looks for any place where there are 1 or more (+) tab characters (\t), then executes the small Perl snippet to convert those tabs into spaces.

The snippet counts how many tabs were matched and multiplies that number by 8 to get the maximum number of spaces required, then subtracts length($`) % 8, which is how far the text before the matched tabs already extends past the last tab stop.
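A small side-by-side sketch of what the e flag changes, using a deliberately simplified replacement (' ' x 3 in place of the full expression). The variable names are illustrative only.

```perl
use strict;
use warnings;

my $plain = "a\tb";

# Without /e the replacement is just text, inserted as written.
(my $literal = $plain) =~ s/\t/' ' x 3/;

# With /e the replacement is evaluated as Perl code: three spaces.
(my $evaluated = $plain) =~ s/\t/' ' x 3/e;

print "[$literal]\n";     # [a' ' x 3b]
print "[$evaluated]\n";   # [a   b]
```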

Marc B