Here's my example. If I wanted to use a regex to replace tabs in code with spaces, but preserve tab characters in the middle or at the end of a line, I would use this search string to capture each tab character at the start of a line: ^(\t)+

Now, how could I write a replacement that substitutes four spaces for each repetition of the captured group? I'm thinking there must be some way to do this with backreferences.

I've found I can work around this by running similar regex replacements (like s/^\t/    /g, s/^    \t/        /g, ...) repeatedly until no more matches are found, but I wonder if there's a quicker way to do all the necessary replacements at once.

Note: I used sed format in my example, but I'm not sure whether this is possible with sed. Does sed support this, and if not, is there a platform that does (e.g., a Python/Java/bash extended regex library)?

brokethebuildagain

4 Answers

With Perl and other engines that support this feature (Java, PCRE (PHP, R, libboost), Ruby, Python (the newer regex module), .NET), you can use the \G anchor, which matches at the position where the previous match ended or at the start of the string:

s/(?:\G|^)\t/    /gm
Casimir et Hippolyte

This works in Perl. Maybe in sed too; I don't know sed.
It relies on the /e modifier, which evaluates the replacement as code, basically a callback.
It takes the length of $1, then repeats '    ' (four spaces) that many times.

Perl sample.

my $str = "
\t\t\tThree
\t\tTwo
\tOne
None";

$str =~ s/^(\t+)/ '    ' x length($1) /emg;

print "$str\n";

Output

            Three
        Two
    One
None
  • That's cool. It would work as long as the string to be replaced is one character long. Is the `x` a multiply? – brokethebuildagain Sep 23 '14 at 20:26
  • Yeah, that's Perl's repetition operator, a kind of multiply that works on strings. It's like `for($i = 0; $i < length($1); $i++) { $replace .= '    '; }`. The regex will match at least 1 character. –  Sep 23 '14 at 21:35
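
Since the question also asks about Python: the same callback idea can be sketched with the standard `re` module, where `re.sub` accepts a function as the replacement. This is only a minimal sketch and not part of the answer above; the helper name `expand_leading_tabs` and its `width` parameter are made up for illustration.

```python
import re

def expand_leading_tabs(text, width=4):
    """Replace each tab in a line's leading run of tabs with `width` spaces."""
    # re.MULTILINE makes ^ match at the start of every line.
    # The lambda plays the role of Perl's /e: it builds the replacement
    # from the length of the matched run of tabs.
    return re.sub(
        r"^\t+",
        lambda m: " " * (width * len(m.group())),
        text,
        flags=re.MULTILINE,
    )

sample = "\t\t\tThree\n\t\tTwo\n\tOne\nNone\tmid"
print(expand_leading_tabs(sample))
```

Tabs that are not part of a line's leading run (like the one in `None\tmid`) are left alone, because the pattern is anchored at the line start.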

Just another idea that came to me: this could also be solved with a positive lookbehind:

s/(?<=^[\t]*)\t/    /gm

It's ugly, but it works in engines that allow variable-length lookbehind.

brokethebuildagain
  • Variable-length lookbehind is not supported on 99% of the engines out there, otherwise it would be great. –  Sep 23 '14 at 21:31
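
To illustrate the comment's point with one concrete engine: Python's standard `re` module rejects a variable-length lookbehind at compile time, while a fixed-width one is accepted. A small check, assuming only the standard library; `supports_pattern` is a hypothetical helper name.

```python
import re

def supports_pattern(pattern):
    """Return True if the stdlib `re` module can compile `pattern`."""
    try:
        re.compile(pattern, re.MULTILINE)
        return True
    except re.error:
        # Raised, for example, when a lookbehind is not fixed-width.
        return False

print(supports_pattern(r"(?<=^\t*)\t"))  # variable-length lookbehind
print(supports_pattern(r"(?<=^\t)\t"))   # fixed-width lookbehind
```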
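A workaround in sed is to loop a single substitution until it stops matching:

sed ':a
   s/^\(\t*\)\t/\1    /
   ta' YourFile

Each pass converts the last tab of the leading run into four spaces, and `ta` branches back to the `:a` label whenever a substitution was made. This assumes a sed that understands `\t` in patterns, such as GNU sed.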
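
The same loop-until-nothing-changes idea can be sketched in Python with `re.subn`, which returns the number of substitutions made. This is a rough equivalent under stated assumptions, not a translation of sed's semantics: it works on the whole text per pass rather than line by line, and the name `expand_leading_tabs_loop` and the `width` parameter are hypothetical.

```python
import re

def expand_leading_tabs_loop(text, width=4):
    """Repeat one substitution until no leading tab is left."""
    # Same regex as the sed command: a leading run of tabs, of which
    # the last tab is replaced on each pass.
    pattern = re.compile(r"^(\t*)\t", re.MULTILINE)
    replacement = r"\g<1>" + " " * width
    while True:
        text, count = pattern.subn(replacement, text)
        if count == 0:  # nothing changed, so no leading tabs remain
            return text

print(expand_leading_tabs_loop("\t\tTwo\n\tOne\nNone\tmid"))
```

The number of passes equals the deepest indentation level plus one, so this is slower than a single substitution with a callback, but it needs nothing beyond plain backreferences.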

NeronLeVelu