
I have a directory structure like the following toy example:

DirectoryTo
DirectoryFrom
-Dir1
---File1.txt
---File2.txt
---File3.txt
-Dir2
---File4.txt
---File5.txt
---File6.txt
-Dir3
---File1.txt
---File5.txt
---File7.txt

I'm trying to copy all the files from DirectoryFrom to DirectoryTo, keeping only the newest copy when the same file name appears in more than one subdirectory. The result should look like this:

DirectoryTo
-File1.txt
-File2.txt
-File3.txt
-File4.txt
-File5.txt
-File6.txt
-File7.txt
DirectoryFrom
-Dir1
---File1.txt
---File2.txt
---File3.txt
-Dir2
---File4.txt
---File5.txt
---File6.txt
-Dir3
---File1.txt
---File5.txt
---File7.txt

I've created a text file with a list of all the subdirectories. The list is ordered so that the directories holding the NEWEST files come first:

Filelist.txt

C:/DirectoryFrom/Dir1
C:/DirectoryFrom/Dir2
C:/DirectoryFrom/Dir3

So what I'd like to do is loop through each directory in Filelist.txt, copy the files, and NOT replace if the file already exists.

I'd like to do this at the command line, in a shell script, or possibly in Python. I'm pretty new to Python, but have a little experience with the command line. However, I've never done something this complicated.

In reality, I have ~60 folders, each with 50-200 files in them, to give you a feel for how many I have. Also, each file is ~75MB.

I've done something similar in R before, but it's slow and not really meant for this. But here's what I've tried for a shell script, edited to fit this toy example:

#!/bin/bash

for line in Filelist.txt
do
    cp -n line C:/DirectoryTo/
done
Gaius Augustus

2 Answers

1

If you have only one directory level inside DirectoryFrom, you can use:

cp -n DirectoryFrom/*/* DirectoryTo

Explanation: this copies every file found in the immediate subdirectories of DirectoryFrom into DirectoryTo, unless a file with the same name already exists there.

The -n flag tells cp not to overwrite files that already exist.

Without -r, cp also skips any directories nested inside those subdirectories.
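
If the order has to come from Filelist.txt rather than from the shell's glob expansion, the same "first copy wins" idea can be sketched in Python, which the question mentions as an option. This is only a minimal sketch under assumptions: the function name copy_without_overwrite is hypothetical, Filelist.txt is assumed to hold one directory per line, and only files directly inside each listed directory are considered.

```python
import shutil
from pathlib import Path


def copy_without_overwrite(list_file, dest_dir):
    """Copy files from each directory named in list_file into dest_dir.

    Directories are visited in the order they are listed. A file is
    skipped when dest_dir already holds one with the same name, so the
    directory listed first wins (the equivalent of cp -n).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the list
        for src in sorted(Path(line).iterdir()):
            target = dest / src.name
            if src.is_file() and not target.exists():
                shutil.copy2(src, target)  # copy2 also keeps timestamps
                copied.append(src)
    return copied
```

With the layout from the question, a call such as copy_without_overwrite("Filelist.txt", "C:/DirectoryTo") would take File1.txt from Dir1 and ignore the copy in Dir3, because Dir1 is listed first.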

dnit13
  • I'm trying this now. I assume it goes through directories in alphabetical order? If so, this will work for me. Otherwise, I need to be able to tell it the order of directories. – Gaius Augustus Feb 24 '16 at 19:11
  • @GaiusAugustus yes, traversal is alphabetical – dnit13 Feb 24 '16 at 19:17
1
# Create the test environment:
mkdir C:/DirectoryTo
mkdir C:/DirectoryFrom
cd C:/DirectoryFrom
mkdir Dir1 Dir2 Dir3 
(
cat << EOF
Dir1/File1.txt
Dir1/File2.txt
Dir1/File3.txt
Dir2/File4.txt
Dir2/File5.txt
Dir2/File6.txt
Dir3/File1.txt
Dir3/File5.txt
Dir3/File7.txt
EOF
) | while read -r f
do
    echo "$f : $(date)"
    echo "$f : $(date)" > "$f"
    sleep 1   # give each file a distinct modification time
done


# create Filelist.txt file :
(
cat << EOF
C:/DirectoryFrom/Dir1
C:/DirectoryFrom/Dir2
C:/DirectoryFrom/Dir3
EOF
) > Filelist.txt


# Generate the list of all distinct file names:
cd C:/DirectoryFrom
while read -r d; do ls -1 "$d"; done < Filelist.txt | sort -u > filenames.txt
cat filenames.txt


# List of all file paths, sorted from oldest to newest:
cd C:/DirectoryFrom
ls -1tr */* > all_filespath_sorted.txt
cat all_filespath_sorted.txt


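# Preview the files to be copied (the newest path for each file name).
# The pattern is anchored so that File1.txt does not also match File11.txt:
while read -r f; do grep "/$f\$" all_filespath_sorted.txt | tail -1; done < filenames.txt


# Copy the selected files:
while read -r f; do grep "/$f\$" all_filespath_sorted.txt | tail -1; done < filenames.txt | while read -r c
do
    echo "$c"
    cp -p "$c" C:/DirectoryTo   # -p preserves the modification time
done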

# verifying :
cd C:/DirectoryTo
ls -ltr

# or
ls -1 | while read -r f; do echo -e "\n$f\n-------"; cat "$f"; done



#------------------------------------------------
# Other solution for a limited number of files :
#------------------------------------------------

# Run from the directory that contains Filelist.txt:
cd C:/DirectoryFrom

# To list files from oldest to newest:
find $(cat Filelist.txt) -type f | xargs ls -1tr

# To copy files; later (newer) copies overwrite earlier (older) ones:
find $(cat Filelist.txt) -type f | xargs ls -1tr | while read -r c
do
    echo "$c"
    cp -p "$c" C:/DirectoryTo
done
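
For comparison, the "newest copy of each file name wins" selection above could also be written in Python, which the question mentions as an option. This is a rough sketch, not a drop-in replacement: the names newest_by_name and copy_newest are hypothetical, "newest" is assumed to mean the latest modification time, and only files directly inside each listed directory are considered.

```python
import shutil
from pathlib import Path


def newest_by_name(source_dirs):
    """Map each file name to the most recently modified path carrying it."""
    newest = {}
    for directory in source_dirs:
        for path in Path(directory).iterdir():
            if not path.is_file():
                continue  # skip nested directories
            best = newest.get(path.name)
            if best is None or path.stat().st_mtime > best.stat().st_mtime:
                newest[path.name] = path
    return newest


def copy_newest(source_dirs, dest_dir):
    """Copy the newest version of every file name into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name, path in newest_by_name(source_dirs).items():
        shutil.copy2(path, dest / name)  # copy2 keeps the modification time
```

Because each file is chosen before anything is copied, every file name is copied only once, which may matter with files of ~75MB each. The directory list could be read from Filelist.txt, one path per line, and passed in as source_dirs.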
Ali ISSA