I am trying to do a dynamic search and replace with Perl on the command line with part of the replacement text being the output of a grep command within backticks. Is this possible to do on the command line, or will I need to write a script to do this?

Here is the command that I thought would do the trick. I thought that Perl would treat the backticks as a command substitution, but instead it just treats the backticks and the content within them as a string:

perl -p -i -e 's/example.xml/http:\/\/exampleURL.net\/`grep -ril "example_needle" *`\/example\/path/g' `grep -ril "example_needle" *`

UPDATE:

Thanks for the helpful answers. Yes, there was a typo in my original one-liner: the target file of grep is supposed to be *.

I wrote a small script based on Schwern's example, but I am getting confusing results. Here is the script I wrote:

 #!/usr/bin/env perl -p -i

my $URL_First = "http://examplesite.net/some/path/";
my $URL_Last = "/example/example.xml";

my @files = `grep -ril $URL_Last .`;
chomp @files;

foreach my $val (@files) {
        @dir_names = split('/',$val);

        if(@dir_names[1] ne $0) {

            my $url = $URL_First .  @dir_names[1] . $URL_Last;

            open INPUT, "+<$val" or die $!;

            seek INPUT,0,0;

            while(<INPUT>) {
                    $_ =~ s{\Q$URL_Last}{$url}g;
                    print INPUT $_;
                    }
            close INPUT;
            }
    }

Basically what I am trying to do is:

  1. Find files that contain $URL_Last.
  2. Replace $URL_Last with $URL_First plus the name of the directory that the matched file is in, plus $URL_Last.
  3. Write the above change to the input file without modifying anything else in the input file.

After running my script, it completely garbled the HTML code in the input file and it cut off the first few characters of each line in the file. This is strange, because I know for sure that $URL_Last only occurs once in each file, so it should only be matched once and replaced once. Is this being caused by a misuse of the seek function?

Dominique
  • If I am not mistaken, `grep -ril "example_needle"` is lacking a target file, as you cannot print a file name on stdin. Perhaps you should just try to explain what you are trying to do instead. – TLP Jan 24 '12 at 02:27
  • take the time to make a small sample file that you can edit into your question, and show what you need as output. Otherwise, we're just guessing. Good luck. – shellter Jan 24 '12 at 07:19
  • Seek is definitely unnecessary, since the file handle position will already be at the beginning when you open the file. As for your problem description, you will find that 1 example is worth more than 10 essays. I assume you want to, for example, open `foo/bar.html`, replace `/example/example.xml` with `http://examplesite.net/some/path/foo/example/example.xml`? – TLP Jan 25 '12 at 05:04
  • Well, since you vanished without clarifying again, I'll just point you towards [File::Find](http://search.cpan.org/perldoc?File::Find), which besides saving you the troublesome backticks also allows you to extract file name and directory. – TLP Jan 25 '12 at 05:58
  • Try this out: http://codepad.org/BFpIwVtz – TLP Jan 25 '12 at 06:05
  • Thanks, I ended up using Tie::File but I am sure File::Find would've also done the trick. – Dominique Jan 26 '12 at 04:10

3 Answers

You should use another delimiter for s/// so that you don't need to escape slashes in the URL:

perl -p -i -e '
s#example.xml#http://exampleURL.net/`grep -ril "example_needle"`/example/path#g' \
    `grep -ril "example_needle" *`

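Your grep command inside the regex will not be executed: the shell leaves it alone because the whole -e argument is in single quotes, and to Perl backticks are not meta characters in a substitution. The replacement part acts as though it were inside a double-quoted string, so the backticks are just literal text. You'd need the /e flag, which makes Perl evaluate the replacement as code, to execute the shell command: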

perl -p -i -e '
s#example.xml#
    qq(http://exampleURL.net/) . `grep -ril "example_needle"` . qq(/example/path)
    #ge' \
    `grep -ril "example_needle" *`
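
To see the /e flag in isolation, here is a minimal sketch written as a script rather than a one-liner. The helper names `command_output` and `rewrite_line` and the `echo some_dir` stand-in command are assumptions for illustration, not part of the question. The point is only that the replacement is evaluated as Perl code, and that backtick output keeps its trailing newline unless you chomp it.

```perl
use strict;
use warnings;

# Hypothetical helper: run a shell command and return its output
# without the trailing newline that backticks preserve.
sub command_output {
    my ($cmd) = @_;
    my $out = `$cmd`;
    chomp $out;
    return $out;
}

# Hypothetical helper: with /e the replacement part is Perl code,
# so the command runs once per match and its output is spliced in.
sub rewrite_line {
    my ($line, $cmd) = @_;
    $line =~ s{example\.xml}
              { 'http://exampleURL.net/' . command_output($cmd) . '/example/path' }ge;
    return $line;
}

# "echo some_dir" stands in for the real grep command.
print rewrite_line(qq{<a href="example.xml">feed</a>\n}, 'echo some_dir');
```

If the command prints several file names, the replacement receives all of them with newlines in between, so in practice you would pick one name before building the URL.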

However, what exactly are you expecting that grep command to do? It lacks a target file. -l prints the names of matching files, and grep without a target file reads stdin, which I suspect will not work. (GNU grep 2.11 and later search the current directory instead when -r is given without a file operand.)

If it is a typo, and you meant to use the same grep as for your argument list, why not use @ARGV?

perl -p -i -e '
s#example.xml#http://exampleURL.net/@ARGV/example/path#g' \
    `grep -ril "example_needle" *`

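This may or may not do what you expect. The shell splits the backtick output on whitespace, so each file name arrives as a separate argument, and an array interpolated into the replacement is joined with spaces. Note also that with -p, <> shifts each name off @ARGV as it opens the file, so @ARGV only holds the files not yet processed; the name of the file currently being read is in $ARGV.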
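
If what you actually want is the name of the file being edited, `$ARGV` may be a better fit than `@ARGV`. The following is a minimal sketch under that assumption; the sub name `rewrite_with_current_file`, the temporary `foo` and `bar` directories, and the choice to use the parent directory name in the URL are all illustrative, not taken from the question.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: read the given files through <> and return the
# rewritten lines. $ARGV holds the name of the file <> is reading now.
sub rewrite_with_current_file {
    my (@paths) = @_;
    local @ARGV = @paths;    # <> shifts names off @ARGV as it opens them
    my @out;
    while (my $line = <>) {
        my $dir = basename(dirname($ARGV));    # "foo" for ".../foo/index.html"
        $line =~ s{example\.xml}{http://exampleURL.net/$dir/example/path}g;
        push @out, $line;
    }
    return @out;
}

# Build two throwaway files so the sketch is self-contained.
my $root = tempdir(CLEANUP => 1);
my @paths;
for my $dir (qw(foo bar)) {
    make_path("$root/$dir");
    my $path = "$root/$dir/index.html";
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} "link: example.xml\n";
    close $fh;
    push @paths, $path;
}

print rewrite_with_current_file(@paths);
```

The sub must be called with at least one path; with an empty list, <> falls back to reading standard input.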

TLP

It seems like what you're trying to do is...

  1. Find a file in a tree which contains a given string.
  2. Use that file to build a URL.
  3. Replace something in a string with that URL.

You have three parts, and you could jam them together into one regex, but it's much easier to do it in three steps. You won't hate yourself in a week when you need to add to it.

The first step is to get the filenames.

# grep -r needs a directory to search, even if it's just the current one
my @files = `grep -ril $search .`;

# strip the newlines off the filenames
chomp @files;

Then you need to decide what to do if you get more than one file from grep. I'll leave that choice up to you; I'm just going to take the first one.

my $file = $files[0];

Then build the URL. Easy enough...

# Put it in a variable so it can be configured
my $Site_URL = "http://www.example.com/";

my $url = $Site_URL . $file;

To do anything more complicated, you'd use the URI module.

Now the search and replace is trivial.

# The \Q means meta-characters like . are matched literally.  Better than
# remembering to escape them all.
$whatever =~ s{\Qexample.xml}{$url}g;
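
A small sketch of what \Q buys you: without it, the `.` in the search string matches any character. The variable `$needle` and the two helper subs are hypothetical names used only for this comparison.

```perl
use strict;
use warnings;

my $needle = 'example.xml';

# Hypothetical helpers comparing an unquoted and a \Q-quoted pattern.
sub matches_unquoted { my ($text) = @_; return $text =~ /$needle/     ? 1 : 0 }
sub matches_quoted   { my ($text) = @_; return $text =~ /\Q$needle\E/ ? 1 : 0 }

# 'exampleZxml' has no dot, so only the unquoted pattern should accept it.
for my $text ('example.xml', 'exampleZxml') {
    printf "%-12s unquoted=%d quoted=%d\n",
        $text, matches_unquoted($text), matches_quoted($text);
}
```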

You want to edit files using -p and -i. Fortunately we can emulate that functionality.

#!/usr/bin/env perl
use strict;
use warnings; # never do without these

my $Site_URL   = "http://www.example.com/";
my $Search     = "example-search";
my $To_Replace = "example.xml";

# Set $^I to edit files. With no argument, just show the output
# script.pl .bak  # saves backup with ".bak" extension
$^I = shift;

my @files = `grep -ril $Search .`;
chomp @files;
my $file = $files[0];

my $url = $Site_URL . $file;

@ARGV = ($files[0]);  # set the file up for editing
while (<>) {
    s{\Q$To_Replace}{$url}g;
    print;  # with $^I set, STDOUT is the edited file, so every line must be printed
}
TLP
Schwern
  • Using `-pi` in the shebang is nice, but in this case, I'd not go that way. If you're going for the long version, saving some typing is moot. You reduce performance by executing the `grep` for every line in the file. In this case I'd just add a `while(<>)` and `print` around the substitution, then set `$^I` to do the in-place edit. – TLP Jan 24 '12 at 04:50
  • Oh, and perhaps `$^I = shift; push @ARGV, $files[0];` – TLP Jan 24 '12 at 05:07
  • @TLP Oh, good point about the grep! I'm going to community wiki this one so you can fix it up if you like. – Schwern Jan 25 '12 at 03:22
  • Without knowing how he intends to use the grep data exactly, this will only be an approximation, but it will emulate `-pi`. – TLP Jan 25 '12 at 03:51

Everyone's answers helped me write a script that ended up working for me. I actually found a bash script solution yesterday, but wanted to post a Perl answer in case anyone else finds this question through Google.

The script that @TLP posted at http://codepad.org/BFpIwVtz is an alternative way of doing this.

Here is what I ended up writing:

#!/usr/bin/perl

use Tie::File;

my $URL_First = 'http://example.com/foo/bar/';
my $Search = 'path/example.xml';
my $URL_Last = '/path/example.xml';

# This grep returns a list of files containing "path/example.xml"
my @files = `grep -ril $Search .`;
chomp @files;

foreach my $File_To_Edit (@files) {

# The output of $File_To_Edit looks like this: "./some_path/index.html"
# I only need the "some_path" part, so I'm going to split up the output and only use $output[1] ("some_path")
    my @output = split('/',$File_To_Edit);

# "some_path" is the parent directory of "index.html", so I'll call this "$Parent_Dir"
    my $Parent_Dir = $output[1];

# Make sure that we don't edit the contents of this script by checking that $Parent_Dir doesn't equal our script's file name.
    if($Parent_Dir ne $0) {

            # The $File_To_Edit is "./some_path/index.html"
            tie @lines, 'Tie::File', $File_To_Edit or die "Can't read file: $!\n";
            foreach(@lines) {
                    # Finally replace "path/example.xml" with "http://example.com/foo/bar/some_path/path/example.xml" in the $File_To_Edit
                    s{\Q$Search\E}{$URL_First$Parent_Dir$URL_Last}g;
                    }
            untie @lines;
            }
    }
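
For comparison, the same edit can be sketched without Tie::File by reading the whole file, substituting in memory, and only then writing the result back. This sidesteps what most likely garbled the files in my earlier script: reading and printing through one `+<` handle means each write lands after the line just read and overwrites text that has not been processed yet. The sub name `replace_in_file` and the temporary sample file are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: replace every literal $search in $path with
# $replacement, reading the file fully before writing anything.
sub replace_in_file {
    my ($path, $search, $replacement) = @_;

    open my $in, '<', $path or die "Cannot read $path: $!";
    my @lines = <$in>;
    close $in;

    # \Q...\E keeps characters such as "." in $search literal.
    s{\Q$search\E}{$replacement}g for @lines;

    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot close $path: $!";
    return;
}

# Throwaway sample file standing in for "./some_path/index.html".
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/index.html";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} qq{<p>intro</p>\n<a href="path/example.xml">feed</a>\n};
close $fh;

replace_in_file($file, 'path/example.xml',
    'http://example.com/foo/bar/some_path/path/example.xml');
```

This holds the whole file in memory, which is fine for HTML pages of ordinary size; Tie::File remains the better fit for very large files.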
Dominique