
I am attempting to use either rsync or cp in a for loop to copy files whose names match a list of 200 names, stored one per line in a .txt file. The files have the .pdbqt extension and sit in a series of subdirectories under one parent folder. The .txt file looks as follows:

file01
file02
file08
file75
file45
...

I have attempted to use rsync with the following command:

rsync -a /home/ubuntu/Project/files/pdbqt/*/*.pdbqt \
--files-from=/home/ubuntu/Project/working/output.txt \
/home/ubuntu/Project/files/top/

When I run the rsync command I receive:

rsync error: syntax or usage error (code 1) at options.c(2346) [client=3.1.2]

I have written a bash script as follows in an attempt to get that to work:

#!/bin/bash
for i in "$(cat /home/ubuntu/Project/working/output.txt | tr '\n' '')"; do
    cp /home/ubuntu/Project/files/pdbqt/*/"$i".pdbqt /home/ubuntu/Project/files/top/;
done

I understand cat isn't a great command to use but I could not figure out an alternate solution to it, as I am still new to using bash. Running that I get the following error:

tr: when not truncating set1, string2 must be non-empty
cp: cannot stat '/home/ubuntu/Project/files/pdbqt/*/.pdbqt': No such file or directory

I assume that the cp error is thrown as a result of the tr error but I am not sure how else to get rid of the \n that is read from the new line separated list.

The expected results are that from the subdirectories in /pdbqt/ with the 12000 .pdbqt files the 200 files from the output.txt list would be copied from those subdirectories into the /top/ directory.

3 Answers


for loops are good when your data is already in shell variables. When reading in data from a file, while ... read loops work better. In your case, try:

while IFS= read -r file; do  cp -i -- /home/ubuntu/Project/files/pdbqt/*/"$file".pdbqt  /home/ubuntu/Project/files/top/; done </home/ubuntu/Project/working/output.txt

or, if you find the multiline version more readable:

while IFS= read -r file
do
    cp -i -- /home/ubuntu/Project/files/pdbqt/*/"$file".pdbqt /home/ubuntu/Project/files/top/
done </home/ubuntu/Project/working/output.txt

How it works

  • while IFS= read -r file; do

    This starts a while loop that reads one line at a time. IFS= tells read not to strip leading and trailing whitespace from the line, and -r tells read not to treat backslashes as escape characters. Each line is stored in the shell variable called file.

  • cp -i -- /home/ubuntu/Project/files/pdbqt/*/"$file".pdbqt /home/ubuntu/Project/files/top/

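    This copies the file. The * matches any subdirectory of pdbqt/. -i tells cp to ask before overwriting an existing file, and -- marks the end of cp's options, so nothing after it is parsed as an option.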

  • done </home/ubuntu/Project/working/output.txt

    This marks the end of the while loop and tells the shell to get the input for the loop from /home/ubuntu/Project/working/output.txt
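
If you want to try the loop before running it on the real data, here is a minimal sketch in a throwaway directory. The sandbox layout, the sample names, and the missing counter are assumptions made for illustration, not part of your project. It drops -i so that it runs unattended, skips blank lines, and reports any name that matches no file instead of letting cp fail on the unexpanded pattern.

```shell
#!/bin/bash
# Throwaway sandbox standing in for /home/ubuntu/Project/files (assumed layout).
demo=$(mktemp -d)
mkdir -p "$demo/pdbqt/a" "$demo/pdbqt/b" "$demo/top"
touch "$demo/pdbqt/a/file01.pdbqt" "$demo/pdbqt/b/file02.pdbqt" "$demo/pdbqt/b/file03.pdbqt"
# file99 is deliberately absent to exercise the "no match" branch.
printf '%s\n' file01 file02 file99 > "$demo/output.txt"

missing=0
while IFS= read -r file; do
    [ -n "$file" ] || continue        # skip blank lines in the list
    copied=0
    for src in "$demo"/pdbqt/*/"$file".pdbqt; do
        [ -e "$src" ] || continue     # an unmatched glob stays literal; skip it
        cp -- "$src" "$demo/top/" && copied=1
    done
    if [ "$copied" -eq 0 ]; then
        echo "no match for: $file" >&2
        missing=$((missing + 1))
    fi
done < "$demo/output.txt"

echo "names without a match: $missing"
```

The inner for loop is safe here because it iterates over the result of a glob rather than over the output of a command, so no word splitting is involved.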

John1024
  • Thank you for the in depth explanation. I had seen some while examples but didn't think that was going to work in my case. – proteinmodels May 28 '19 at 05:54
  • You're welcome. The issue with the `for` loop approach is that it is difficult to control the various expansions (principally _word splitting_ and _pathname expansion_) that the shell applies to it. The `while...read` approach avoids all that. – John1024 May 28 '19 at 05:59

Do dirs in Project/files/pdbqt/* or files *.pdbqt have dashes (-) in the name?

The error points to this line in the rsync source file options.c: "Your options have been rejected by the server.\n"

which makes me think that it's interpreting files or directories matched by your glob as rsync options.

for i in $( < /home/ubuntu/Project/working/output.txt LC_CTYPE=C tr '\n' ' ' )
do
  cp /home/ubuntu/Project/files/pdbqt/*/"${i}.pdbqt" /home/ubuntu/Project/files/top/
done

I think the second argument of your tr call is missing a space: it is an empty string ('') where it should be a single space (' ')

cat /home/ubuntu/Project/working/output.txt | tr '\n' ' '

John1024's use of while and read is better than mine.
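
One more difference from your script: I also dropped the double quotes around the command substitution. The sketch below, which uses a temporary list file as a stand-in for output.txt (an assumption for illustration), counts how many times each form of the loop runs. With the quotes, the whole list arrives as a single word, so the loop body runs only once.

```shell
#!/bin/bash
# Temporary stand-in for output.txt (hypothetical sample names).
list=$(mktemp)
printf '%s\n' file01 file02 file08 > "$list"

# Quoted substitution: the entire output is one word, so one iteration.
quoted=0
for i in "$(tr '\n' ' ' < "$list")"; do
    quoted=$((quoted + 1))
done

# Unquoted substitution: the shell splits the output on whitespace.
unquoted=0
for i in $(tr '\n' ' ' < "$list"); do
    unquoted=$((unquoted + 1))
done

echo "quoted: $quoted iteration(s), unquoted: $unquoted iteration(s)"
# prints "quoted: 1 iteration(s), unquoted: 3 iteration(s)"
```

Since the default IFS already contains the newline character, the unquoted form would split the list the same way without tr.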

wrothe

You are right to think of rsync. rsync provides the option --files-from="yourfile", which transfers all the files listed in your text file (relative to the base directory you specify as the source) to the destination (either host:/dest/path or, locally, /dest/path alone).

Because --files-from implies --relative (-R), rsync would otherwise recreate each listed path under the destination; specify --no-R if you only want the files themselves. The base directory that the listed names are relative to is given as the source argument. For example, to transfer all files in your text file to some remote host, where the listed files are located in the current directory, you could use:

rsync -uai --no-R --files-from="textfile" ./ host:/dest/path

Where the command essentially specifies you read the names to transfer from textfile where the files will be found under ./ (the current directory) and you will transfer the files to host:/dest/path on the host you specify. You can see man 1 rsync for full details.
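
In your case the list holds bare names without the subdirectory or the .pdbqt extension, so rsync cannot find them as they stand. One possible workaround, sketched here in a throwaway directory, is to first expand each name into a path relative to the base directory and pass that generated list to --files-from. The sandbox layout and the paths.txt file name are assumptions for illustration, and rsync is only invoked if it is installed.

```shell
#!/bin/bash
# Throwaway sandbox standing in for /home/ubuntu/Project/files (assumed layout).
demo=$(mktemp -d)
mkdir -p "$demo/pdbqt/a" "$demo/pdbqt/b" "$demo/top"
touch "$demo/pdbqt/a/file01.pdbqt" "$demo/pdbqt/b/file02.pdbqt" "$demo/pdbqt/b/file03.pdbqt"
printf '%s\n' file01 file02 > "$demo/output.txt"

# Expand each bare name into a path relative to the base directory pdbqt/.
(
    cd "$demo/pdbqt" || exit 1
    while IFS= read -r name; do
        for path in */"$name".pdbqt; do
            if [ -e "$path" ]; then
                printf '%s\n' "$path"
            fi
        done
    done < "$demo/output.txt"
) > "$demo/paths.txt"

# --no-R keeps the subdirectory part from being recreated under top/.
if command -v rsync >/dev/null 2>&1; then
    rsync -a --no-R --files-from="$demo/paths.txt" "$demo/pdbqt/" "$demo/top/"
fi
```

Here paths.txt ends up containing a/file01.pdbqt and b/file02.pdbqt, which rsync resolves against the source argument.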

David C. Rankin
  • Thanks for the insightful comment, it made how rsync works a fair bit clearer. When attempting to do it this way however, it only searched in the current directory and not into the subdirectories from the current tree? – proteinmodels May 28 '19 at 06:08
  • No, it searches from the base directory for whatever the file names in your text file say the files are. Let's say your text file has `"project/1/study.txt\nproject/2/data.txt\n..."`, then if your base directory is `./` it looks for `./project/1/study.txt`, or if your base directory were `/home/you/`, it would look for `/home/you/project/1/study.txt`. (that's the reason `--no-R` is needed, so `rsync` doesn't attempt to append relative names to your base directory) – David C. Rankin May 28 '19 at 16:01