-2

As the command below shows, I'm trying to do an in-place find and replace: one string that is part of a URL should be replaced with another string in ALL files that contain that URL pattern.

In other words, every *.txt file that contains 'https://hostname/' must be found, and every occurrence of 'https://hostname/' in that file must be replaced with 'https://hostname.fq.dn/'.

So far I have put together this:

grep -L -R -e 'https://hostname/' /srv/www/htdocs/intranet/data/pages/ | grep '.txt$' | xargs -n1 sed -i.bak 's|https://hostname/|https://hostname.fq.dn/|'

I guess there must be an error somewhere in the sed regex. I have searched and read for hours and still haven't found the problem. sed creates the backup files, but nothing seems to be replaced within the files.

Any tips? For example, how can I debug the regex I'm using in sed? I'm pretty lost and haven't found anything like this on Stack Exchange/Server Fault yet.

Axel Werner
  • Huh?! Two downvotes so far but no comment or tip at all. Why is that? – Axel Werner Aug 07 '15 at 15:43
  • The expression `s|https://hostname/|https://hostname.fq.dn/|g` should do it, I think. I tried it and it didn't work either. What am I missing here? https://hostname/ is not matching at all. – Axel Werner Aug 07 '15 at 15:51

2 Answers

2

OMG, I'm so stupid! I finally found the error. The regex is fine and works properly. The problem is the grep option '-L', which lists the files that do NOT contain a match. It is supposed to be '-l', which returns just the paths of the files that do match.

grep -l -R -e 'https://hostname/' /srv/www/htdocs/intranet/data/pages/*.txt | xargs -n1 sed -i.bak 's|https://hostname/|https://hostname.fq.dn/|'

duh!!! sorry for the trouble here.
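The difference between the two grep options can be checked in a throwaway directory. This is only a small sketch: the temporary directory and the file names `match.txt` and `nomatch.txt` are made up for the demonstration and are not part of the original setup.

```shell
# Sandbox directory; all names below are hypothetical.
tmp=$(mktemp -d)
printf 'see https://hostname/wiki\n' > "$tmp/match.txt"
printf 'nothing to replace here\n'   > "$tmp/nomatch.txt"

# -l: list files that contain the pattern
with_match=$(grep -l -R -e 'https://hostname/' "$tmp")
# -L: list files that do NOT contain the pattern
without_match=$(grep -L -R -e 'https://hostname/' "$tmp")

echo "-l lists: $with_match"
echo "-L lists: $without_match"
```

With '-L', sed was only ever handed files that had nothing to replace, which would explain why backups appeared but no content changed.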

Axel Werner
1

I recommend debugging on one selected file:

sed 's|something|replacement|g' a_file.txt

to see what happens.
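One way to make that single-file check easier to read is to print only the lines that sed actually changed. The sketch below assumes a made-up sample file (`sample.txt` in a temporary directory) containing a URL of the shape mentioned in the comments; `-n` suppresses the normal output and the `p` flag prints a line only when the substitution succeeded on it.

```shell
# Sandbox file; the path and contents are hypothetical test data.
tmp=$(mktemp -d)
printf '%s\n' 'intro line' \
    'link: https://hostname/subdir/page.php?moreUrlParms' > "$tmp/sample.txt"

# No -i here: the file is left untouched, only changed lines are printed.
changed=$(sed -n 's|https://hostname/|https://hostname.fq.dn/|gp' "$tmp/sample.txt")
echo "$changed"
```

If this prints nothing, the pattern did not match any line in that file, and the regex (or the file contents) needs another look before running anything in place.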

Once your regular expression is debugged, you can simplify your command like this:

shopt -s globstar
sed -i.bak 's|https://hostname/|https://hostname.fq.dn/|g' /srv/www/htdocs/intranet/data/pages/**/*.txt

There's no need for the grep and xargs.
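One side effect of dropping grep is that `sed -i.bak` then rewrites every .txt file and leaves a backup next to each one, whether or not it contained the URL. If that matters, a possible alternative is to let find do the filtering. This is a sketch under assumptions, not a drop-in command: the directory layout and file names are invented test data, and the real path from the question would replace `"$tmp/pages"`.

```shell
# Sandbox tree; directory and file names are hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/pages/sub dir"
printf 'a https://hostname/x and https://hostname/y\n' > "$tmp/pages/sub dir/one.txt"
printf 'no link here\n' > "$tmp/pages/two.txt"

# The first -exec acts as a filter (grep -q succeeds only on a match);
# the second -exec runs sed only on files that passed the filter.
find "$tmp/pages" -type f -name '*.txt' \
    -exec grep -q -F 'https://hostname/' {} \; \
    -exec sed -i.bak 's|https://hostname/|https://hostname.fq.dn/|g' {} +

cat "$tmp/pages/sub dir/one.txt"
```

Because find passes file names directly to grep and sed, paths containing spaces are handled without extra quoting, and it does not depend on the shell's globstar option.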

The regular expression you provided works for me on a test file containing just https://hostname/.

buff
  • Thanks for this tip. In reality the txt files I'm supposed to search through contain strings like `https://hostname/subdir/page.php?moreUrlParms`, and I'm just trying to replace the hostname part with a new one. It didn't work at all, but I need to test it as you suggested, without grep and xargs. Thanks – Axel Werner Aug 07 '15 at 16:02
  • I missed that your txt files can be in subdirectories: please see the updated command. Still: debug first on a single file, and don't run a complex command until you have sorted out the regex. – buff Aug 07 '15 at 16:06