I want to check whether a directory was created on today's date. If it was, upload it to HDFS; if the directory's modification date is different, print a message saying that the directory has already been copied to HDFS.

#!/bin/bash
export DATA_PATH=/data/1/sanket
#We will enter the directory where we want to check other directories
cd $DATA_PATH

#Details of the directories will be written to no_of_files.txt
ls -lh $DATA_PATH/ >> temp/no_of_files.txt

#We will extract name of the file from above file.
nameoffile=$(awk '{print $9}' temp/no_of_files.txt)

#Now we want today's date.
echo $(date) >> temp/date.txt

#So the modification date and today's date will be copied to variables.
filedate=$(awk '{print $6 $7}' temp/no_of_files.txt)
todaydate=$(awk '{print $2 $3}' temp/date.txt)

export "nameoffile"
export "filedate"
export "todaydate"

rm -fr $DATA_PATH/temp/no_of_files.txt
rm -fr $DATA_PATH/temp/name_of_files.txt
rm -fr $DATA_PATH/temp/date.txt

#Directory on HDFS where we want to copy data    
path=sanket_data

#First check whether the directory's modification date matches today's date; if so,
#copy the data to HDFS. If they don't match, report that the file was already copied.

if [[ "$filedate" == "$todaydate" ]]; then
for filename in $nameoffile; do
    #path=sanket_data
    #nameoffile=$(awk '{print $9}' temp/no_of_files.txt)
    #for filename in $nameoffile
    /usr/bin/hadoop fs -put $DATA_PATH/$filename /user/sanket/$path
    #echo $filename already copied!
    done
elif [[ "$filedate" != "$todaydate" ]]; then

    #/usr/bin/hadoop fs -put $DATA_PATH/$filename /user/sanket/$path
    echo $filename already copied!
    #hdfs dfs -put $filename /user/sanket/$path
fi
sanketthodge
  • Just a side question: do you know that the last modification date of a directory has very little to do with the last modification date of its content? If not, you should maybe have a look at [this](http://stackoverflow.com/questions/3620684/directory-last-modified-date) – Renaud Pacalet Aug 27 '15 at 06:58
  • Hi @RenaudPacalet, I went through the link that you shared. But all I care about is the modification date of the directory and not of its content. Thanks. – sanketthodge Aug 27 '15 at 07:10
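
The distinction raised in these comments can be checked with a small experiment. This is only a sketch, assuming GNU coreutils (`touch -d`, `stat -c %Y`); the temporary directory and the file names `data.txt` and `new.txt` are hypothetical. It backdates a directory, then compares the directory's modification time after appending to an existing file and after creating a new one.

```shell
#!/bin/sh
# Sketch: which operations change a directory's own modification time.
# Assumes GNU coreutils (touch -d, stat -c %Y); all names are hypothetical.
demo_dir=$(mktemp -d)
echo "first" > "$demo_dir/data.txt"

# Backdate the directory so that any later change to its mtime is visible.
touch -d "2 days ago" "$demo_dir"
before=$(stat -c %Y "$demo_dir")

# Appending to an existing file changes the file's mtime, not the directory's.
echo "more" >> "$demo_dir/data.txt"
after_append=$(stat -c %Y "$demo_dir")

# Creating a new entry updates the directory's mtime.
: > "$demo_dir/new.txt"
after_create=$(stat -c %Y "$demo_dir")

[ "$before" -eq "$after_append" ] && echo "append: directory mtime unchanged"
[ "$before" -ne "$after_create" ] && echo "create: directory mtime updated"
rm -rf "$demo_dir"
```

So a check on the directory's own date detects files being added or removed, but not edits to files that are already there.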

1 Answer

What you need is touch, date and newer. Let us first use touch and date to create a temporary empty file with a Last Modification Date (LMD) of today at 00:00:

touch -d "$( date +%F )" today0000

Now let us use newer to test whether $DATA_PATH has a more recent LMD than today0000:

if newer "$DATA_PATH" today0000; then
  /usr/bin/hadoop fs -put "$DATA_PATH"/* "/user/sanket/$path"
else
  echo "$DATA_PATH older than today ($( date +%F ))"
fi
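
If `newer` is not available on your system, the same comparison can probably be done with the shell's own `-nt` ("newer than") file test. The following is only a sketch under a few assumptions: GNU `touch` (for `-d`), a shell whose `[` supports `-nt` (bash, dash and ksh do, although POSIX does not require it), and a hypothetical helper name, `modified_today`, that is not part of the original script. It creates the midnight reference file in a temporary location, compares, and cleans up.

```shell
#!/bin/sh
# Sketch: succeed if the directory's own modification time is later than
# today at 00:00 (local time). Assumes GNU touch and a shell with [ -nt ].
# The function body runs in a subshell, so its variables do not leak out.
modified_today() (
    ref=$(mktemp) || exit 2
    # Set the reference file's modification time to today at 00:00.
    touch -d "$(date +%F)" "$ref"
    # -nt is true when the left-hand file is newer than the right-hand one.
    [ "$1" -nt "$ref" ]
    status=$?
    rm -f "$ref"
    exit "$status"
)

# Possible use with the variables from the question:
#   if modified_today "$DATA_PATH"; then
#       /usr/bin/hadoop fs -put "$DATA_PATH"/* "/user/sanket/$path"
#   else
#       echo "$DATA_PATH older than today ($(date +%F))"
#   fi
```

Using a temporary reference file avoids leaving `today0000` behind in the current directory.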
Renaud Pacalet