
I have a text file containing a list of file names with the structure ABC123456A or ABC123456AA. For each one, I would like to check whether the file ABC123456ZZP also exists, i.e. I want to replace the letter(s) after ABC123456 with ZZP.

Can I do this using sed?

moadeep

3 Answers


Like this?

X=ABC123456 ;  echo ABC123456AA | sed -e "s,\(${X}\).*,\1ZZP,"
wilx
  • Exactly this. Thanks very much – moadeep Jan 14 '13 at 12:00
  • That is the wrong way to do it as it's inefficient and will fail for various file names. – Ed Morton Jan 14 '13 at 14:08
  • The file names are autogenerated and all have the same format: 3 letters, followed by 2 digits (the year), then 4 further digits which increment from 0000 to 9999. The letters which follow can vary, but they are not so important as long as I can access the first 9 characters – moadeep Jan 14 '13 at 16:06
  • Then for the right way to do that see the response by @peteches or mine. A pipe to sed is the wrong approach, at best it's less efficient than the normal shell solution and if you really WANT to pipe to something to get the first 9 characters just pipe to `cut -c1-9` as that'd be more efficient and more robust than using sed. I still wouldn't do it though when the shell builtins work just fine. – Ed Morton Jan 14 '13 at 17:54

You could use sed as wilx suggests but I think a better option would be bash.

while read file; do
    base=${file:0:9}
    [[ -f ${base}ZZP ]] && echo "${base}ZZP exists!"
done < file

This loops over each line in file. For each line, base is set to the first 9 characters (read strips leading and trailing whitespace first), then the test checks whether a file named base with ZZP appended exists and prints a message if it does.

peteches
  • That will fail for file names that contain spaces, backslashes, etc. so while it might work for the OPs example it's wrong in general. – Ed Morton Jan 14 '13 at 14:09
  • Granted, in general this would need a dedicated function to sanitise the path to ensure meta-characters, spaces and other oddities are escaped, but when you are given a specific format of filenames that would be unnecessary. – peteches Jan 14 '13 at 14:43
  • You don't need a dedicated function, just use the correct form of "read" (`IFS= read -r`). It's not noticeably harder to do it robustly like that and will save you shooting yourself in the foot later. – Ed Morton Jan 14 '13 at 17:49
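
The difference between the two forms of read mentioned in the comment above can be seen directly. The sketch below feeds one awkward line to each form; the helper names `read_plain` and `read_raw` are made up for this illustration and are not part of any answer here.

```shell
# read_plain and read_raw are hypothetical helpers for this demonstration.
# Each reads one line from standard input and prints it between brackets.

read_plain() {
    # Plain read: leading/trailing whitespace is trimmed and
    # backslashes are treated as escape characters and removed.
    read line
    printf '[%s]\n' "$line"
}

read_raw() {
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    IFS= read -r line
    printf '[%s]\n' "$line"
}

printf '%s\n' '  my\file  ' | read_plain   # prints [myfile]
printf '%s\n' '  my\file  ' | read_raw     # prints [  my\file  ]
```

With the fixed ABC123456AA format from the question both forms give the same result, so the choice only matters once names with spaces or backslashes show up.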

Look:

$ str="ABC123456AA"
$ echo "${str%[[:alpha:]][[:alpha:]]*}"
ABC123456

so do this:

while IFS= read -r tgt; do
    tgt="${tgt%[[:alpha:]][[:alpha:]]*}ZZP"
    [[ -f "$tgt" ]] && printf "%s exists!\n" "$tgt"
done < file

It will still fail for file names that contain newlines, so let us know if you have that situation. Unlike the other posted solutions, though, it will work for file names with other than 9 key characters and for file names containing spaces, commas, backslashes, globbing characters, etc., and it is efficient.
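
One case may be worth checking against the formats in the question: the pattern above needs two consecutive letters to match, so for a name with a single trailing letter such as ABC123456A the shortest matching suffix appears to be BC123456A, which leaves only A. A minimal sketch of a variant that strips any run of trailing letters, using only POSIX parameter expansion, follows. `strip_trailing_letters` is a hypothetical helper name, and the sketch assumes every name contains at least one non-letter character.

```shell
# strip_trailing_letters is a hypothetical helper, not from the answer above.
# Assumption: the name contains at least one non-letter (e.g. a digit);
# a name made only of letters would be reduced to an empty string.
strip_trailing_letters() {
    # Remove the longest prefix ending in a non-letter; what remains is
    # the run of letters at the end of the name (possibly empty).
    suffix=${1##*[![:alpha:]]}
    # Remove that run from the end of the original name.
    printf '%s\n' "${1%"$suffix"}"
}

strip_trailing_letters ABC123456A    # prints ABC123456
strip_trailing_letters ABC123456AA   # prints ABC123456
```

Inside the loop this could be used as `tgt="$(strip_trailing_letters "$tgt")ZZP"`, at the cost of one command substitution per line.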

Since you said now that you only need the first 9 characters of each line and you were happy with piping every line to sed, here's another solution you might like:

cut -c1-9 file |
while IFS= read -r tgt; do
    [[ -f "${tgt}ZZP" ]] && printf "%sZZP exists!\n" "$tgt"
done

It'd be MUCH more efficient and more robust than the sed solution, and similar in both respects to the other shell solutions.
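
To try the cut-based loop without touching real data, it can be run against a throwaway directory. The sketch below is only an illustration: `check_zzp`, the sandbox directory, the list name and the second file name are all made up, and it uses the portable `[ -f ... ]` test in place of `[[ -f ... ]]` so that it also runs under a plain POSIX sh.

```shell
# check_zzp is a hypothetical wrapper around the cut-based loop.
# $1 = list file, $2 = directory to look in (both are assumptions).
check_zzp() {
    cut -c1-9 "$1" |
    while IFS= read -r tgt; do
        if [ -f "$2/${tgt}ZZP" ]; then
            printf '%sZZP exists!\n' "$tgt"
        fi
    done
}

sandbox=$(mktemp -d)
printf '%s\n' ABC123456A ABC130001AA > "$sandbox/list"
: > "$sandbox/ABC123456ZZP"            # only the first entry has a ZZP file

check_zzp "$sandbox/list" "$sandbox"   # prints ABC123456ZZP exists!
rm -rf "$sandbox"
```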

Ed Morton
  • I was waiting for your `awk` solution for this :) – Mirage Jan 15 '13 at 04:01
  • awk is a tool for processing text, not for manipulating (e.g. testing for existence of) files - that and manipulating processes is what shell is for. You could obviously replace cut with awk but cut's a better tool for this job. Sorry to disappoint :-). – Ed Morton Jan 15 '13 at 05:41
  • I have answered this question http://stackoverflow.com/questions/14329395/using-awk-to-remove-specific-white-space-and-replace-with-semicolon/14329681#14329681 but I am pretty sure you could refine that in a few lines. It looked dirty to me, but it worked. Have a look – Mirage Jan 15 '13 at 06:36
  • I just posted an answer there. – Ed Morton Jan 15 '13 at 16:03