1

I would like to rename a bunch of files by changing only one part of the file name and doing that based on an exact match in a list in another file. For example, if I have these file names:

sample_ACGTA.txt
sample_ACGTA.fq.abc
sample_ACGT.txt
sample_TTTTTC.tsv
sample_ACCCGGG.fq
sample_ACCCGGG.txt
otherfile.txt

and I want to find and replace based on these exact matches, which are found in another file called replacements.txt:

ACGT    name1
TTTTTC  longername12
ACCCGGG nam7
ACGTA   another4

The desired resulting file names would be:

sample_another4.txt
sample_another4.fq.abc
sample_name1.txt
sample_longername12.tsv
sample_nam7.fq
sample_nam7.txt
otherfile.txt

I do not want to change the contents. So far I have tried sed and mv based on my search results on this website. With sed I found out how to replace the contents of the file using my list:

while read from to; do
  sed -i "s/$from/$to/" infile ;
done < replacements.txt

and with mv I have found a way to rename files if there is one simple replacement:

for files in sample_*; do
  mv "$files" "${files/ACGTA/another4}"
done 

But how can I put them together to do what I would like?

Thank you for your help!

pgugger

2 Answers

2

You can combine your for and while loops and use only mv:

while read -r from to ; do
  for i in sample_* ; do
    # Rename only when the substitution actually changes the name
    if [ "$i" != "${i/$from/$to}" ] ; then
      mv -- "$i" "${i/$from/$to}"
    fi
  done
done < replacements.txt
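
Note that the loop above replaces the first occurrence of `$from` anywhere in the name, so `ACGT` also matches inside `sample_ACGTA.txt`. Here is a minimal sketch of one way to get an exact match instead. It assumes every file is named `sample_<key>.<extension>` and that keys contain no dots or glob characters. The function name `rename_exact` and the scratch directory are only illustrative, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename sample_<key>.<ext> using exact keys
# taken from a two-column list (key, new name).
rename_exact() {
  local list=$1 from to f
  while read -r from to; do
    [ -n "$from" ] || continue        # skip blank lines
    # The glob requires a dot right after the key, so sample_ACGT.*
    # does not pick up sample_ACGTA.txt.
    for f in sample_"$from".*; do
      [ -e "$f" ] || continue         # the glob matched nothing
      mv -- "$f" "sample_${to}.${f#sample_"$from".}"
    done
  done < "$list"
}

# Demo in a scratch directory, using the file names from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch sample_ACGTA.txt sample_ACGTA.fq.abc sample_ACGT.txt \
      sample_TTTTTC.tsv sample_ACCCGGG.fq sample_ACCCGGG.txt otherfile.txt
printf '%s\n' 'ACGT    name1' 'TTTTTC  longername12' \
              'ACCCGGG nam7' 'ACGTA   another4' > replacements.txt
rename_exact replacements.txt
ls
```

Because the key must be followed directly by a dot, the order of the lines in replacements.txt should not matter with this variant, and no error is printed for keys that have no matching file.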

An alternative with sed is to use the e flag of the s command, a GNU sed extension that executes the result of a substitution as a shell command. (Use with caution! Try it without the trailing e first to print the commands that would be executed.)

Hence:

sed 's/\(\w\+\)\s\+\(\w\+\)/mv sample_\1\.txt sample_\2\.txt/e' replacements.txt

would parse your replacements.txt file and rename all your .txt files as desired.
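
To illustrate the "try without the e first" advice, here is a small dry-run sketch. It writes the replacement list from the question to a temporary file and prints the generated commands without running them. The pattern is rewritten with POSIX character classes (an assumption made for portability; the `\w`, `\s` and `\+` forms above are GNU extensions), and the variable names `list` and `preview` are only illustrative.

```shell
#!/usr/bin/env bash
# Scratch copy of the replacement list from the question.
list=$(mktemp)
printf '%s\n' 'ACGT    name1' 'TTTTTC  longername12' \
              'ACCCGGG nam7' 'ACGTA   another4' > "$list"

# No trailing `e` flag: sed only prints each generated mv command.
preview=$(sed 's/\([[:alnum:]_]\{1,\}\)[[:space:]]\{1,\}\([[:alnum:]_]\{1,\}\)/mv sample_\1.txt sample_\2.txt/' "$list")
printf '%s\n' "$preview"

rm -f "$list"
```

Once the printed commands look right, adding the `e` flag back (with GNU sed) executes them.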

We just have to add a loop to deal with the other extensions:

for j in .txt .bak .tsv .fq .fq.abc ; do
  sed "s/\(\w\+\)\s\+\(\w\+\)/mv 'sample_\1$j' 'sample_\2$j'/e" replacements.txt
done

(Note that you will get error messages when it tries to rename non-existent files, for example when it tries to execute mv sample_ACGT.fq sample_name1.fq although sample_ACGT.fq does not exist.)

Qeole
  • Thank you, Qeole. I cannot get the `sed` command to work... do I need to edit something for my case? I also tried your first solution with just `mv`, which should work after I sort my replacements.txt file, as in the answer given above by Joe. – pgugger Jun 10 '14 at 18:49
  • @user2250055 The first `sed` line I wrote is only valid for `.txt` files. Could the problem come from this? What command exactly did you type? – Qeole Jun 10 '14 at 19:45
  • I copied and pasted your commands exactly, but mostly I was referring to the `sed` loop... I get an error for each file, for example: mv: cannot stat 'sample_ACGTA': No such file or directory. Any thoughts? The `sed` command line one works except that it doesn't handle non .txt extensions, as you mentioned. Thanks again! – pgugger Jun 10 '14 at 20:21
  • This is what I tried to explain (too briefly, it seems) in the parenthetical remark at the end of my answer. The `for` loop works like this: for each extension in the list, it applies every pattern in replacements.txt with that extension and tries to rename, even if no file matches that pattern and extension. First pass: `mv sample_ACGT.txt sample_name1.txt`, `mv sample_TTTTTC.txt sample_longername12.txt`, etc.; second pass: `mv sample_ACGT.bak sample_name1.bak`, `mv sample_TTTTTC.bak sample_longername12.bak`, etc., even if `sample_ACGT.txt` or `sample_TTTTTC.bak` doesn't exist. [1/2] – Qeole Jun 10 '14 at 21:01
  • @user2250055 [2/2] So those errors occur only because the files don't exist, and I don't think they prevent existing files from being renamed as you wish. I couldn't find an easy way to avoid those errors; maybe we could just discard them by adding `2>/dev/null` somewhere. Is it clear enough? – Qeole Jun 10 '14 at 21:04
  • I think I am understanding correctly that the error messages are normal. However, when I run the script, it doesn't actually replace any file names (even if I suppress the error messages with `2>/dev/null`). – pgugger Jun 11 '14 at 21:36
  • @user2250055 I had made a mistake (I wrote `\$j` instead of `$j`; I don't know how those `\`s got in there). I just corrected it in my answer. Now it seems to work for me; could you try again on your side? – Qeole Jun 11 '14 at 22:11
  • That's perfect! Thanks for bearing with me... much of this is still new to me! – pgugger Jun 11 '14 at 23:42
  • Thanks @Qeole. Works perfectly. For bash beginners like me, I would only add that if the string to match is in the second column of your file, you just need to change `while read from to` to `while read to from`. Then it will work. – MagíBC Dec 13 '22 at 11:45
-1

You could use awk to generate commands:

% awk '{print "for files in sample_*; do mv \"$files\" \"${files/" $1 "/" $2 "}\"; done" }' replacements.txt
for files in sample_*; do mv "$files" "${files/ACGT/name1}"; done
for files in sample_*; do mv "$files" "${files/TTTTTC/longername12}"; done
for files in sample_*; do mv "$files" "${files/ACCCGGG/nam7}"; done
for files in sample_*; do mv "$files" "${files/ACGTA/another4}"; done

Then either copy/paste or pipe the output directly to your shell:

% awk '{print "for files in sample_*; do mv \"$files\" \"${files/" $1 "/" $2 "}\"; done" }' replacements.txt | bash

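If you want the longer match string to be used first, sort the replacements in reverse order first. Setting LC_ALL=C makes the comparison byte-wise, so ACGTA reliably lands before ACGT after the reversal; in other locales the order can differ:

% LC_ALL=C sort -r replacements.txt | awk '{print "for files in sample_*; do mv \"$files\" \"${files/" $1 "/" $2 "}\"; done" }' | bash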
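
If relying on sort order feels fragile, one possible alternative is to order the list explicitly by the length of the match string. The following is only a sketch under that assumption: it prefixes each line with the length of its first field, sorts numerically in descending order, and strips the prefix again. The variable names `list` and `sorted` are illustrative.

```shell
#!/usr/bin/env bash
# Scratch copy of the replacement list from the question.
list=$(mktemp)
printf '%s\n' 'ACGT    name1' 'TTTTTC  longername12' \
              'ACCCGGG nam7' 'ACGTA   another4' > "$list"

# Prefix each line with the length of its first field, sort on that
# number (longest first), then cut the length prefix off again.
sorted=$(awk '{ print length($1), $0 }' "$list" | sort -rn | cut -d' ' -f2-)
printf '%s\n' "$sorted"

rm -f "$list"
```

The sorted list can then be fed to the awk command in place of replacements.txt, so that ACGTA is handled before ACGT.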
Joe
  • Thank you, Joe. This mostly works... the only problem is that when it replaces ACGT with name1, it will replace the ACGTA with name1A instead of another4 as is specified in replacements.txt. Any additional suggestions? – pgugger Jun 05 '14 at 23:09
  • `sort -r replacements.txt` so the longest match string is used first. I'll add that to the answer. – Joe Jun 05 '14 at 23:13
  • I think `sort` works, except without `-r`... thanks again! Out of curiosity, would there also be any straightforward way to do it with `sed` as I was originally thinking? – pgugger Jun 05 '14 at 23:55
  • sed would require a slightly more complex regex, but could certainly perform the same function that awk is doing here. – Joe Jun 06 '14 at 00:02