1

I would like to compare two binary files (very small, 100 KB each) and replace the older one with the most recently modified one.

I have created a simple script, but I need your help to make it run properly:

#!/bin/sh
# select the two files
FILE1="/dir1/file1.binary"
FILE2="/dir2/file2.binary"

# create the hash of the two files
HASH1="$(md5sum $FILE1 | cut -c 1-32)"
HASH2="$(md5sum $FILE2 | cut -c 1-32)"

# compare the two hashes
if [ "$HASH1" == "$HASH2" ];

# if the two hashes are the same, exit
then
echo "the two files are identical"
exit 0

# otherwise compare which of them has been last modified
fi
DATE1="(stat -c %Y $FILE1)"
DATE2="(stat -c %Y $FILE2)"

# if FILE1 is newer than FILE2, replace FILE2 with FILE1
if [ "${DATE1}" -gt "${DATE2}" ];
then
cp $FILE1 $FILE2
echo "${FILE2} was replaced by ${FILE1}"

# if FILE2 is newer than FILE1, replace FILE1 with FILE2
fi
cp $FILE2 $FILE1
echo "${FILE1} was replaced by ${FILE2}"
exit 0

The script seems to work (at least if the two files are identical), but if one file has been modified, I receive the following error:

line 24: [: {(stat -c %Y test1)}: integer expression expected

What is wrong?

By the way, is there a better way to solve this problem?

Thanks


Thank you so much, everybody, for your help. Here is what the script looks like now. It also sends a notification to QTS on QNAP, but that can be taken out if the script runs elsewhere or the notification is not needed.

#!/bin/sh

# select the two files
FILE1="/dir1/file1"
FILE2="/dir2/file2"

# use or create a log file with timestamp of the output
LOG="/dir1/ScriptLog.txt"
TIMESTAMP=$(date +"%Y-%m-%d %Hh:%M")

if [ ! -e "$LOG" ]; then
    touch "$LOG"
    echo "$TIMESTAMP - INFO: '$LOG' did not exist and has been created." >&2
# else
#   echo "$TIMESTAMP - INFO: '$LOG' exists and it will be used if any change to '$FILE1' 
#   or to '$FILE2' is needed." >&2
fi

# You can also pass the two file names as arguments for the script
if [ $# -eq 2 ]; then
    FILE1=$1
    FILE2=$2
fi

# check if the two files exist and are regular
if [ -f "$FILE1" -a -f "$FILE2" ]; then

    # meanwhile compare FILE1 against FILE2
    # if files are identical, stop there
    if cmp "$FILE1" "$FILE2" >/dev/null 2>&1; then
        echo "$TIMESTAMP - INFO: '$FILE1' and '$FILE2' are identical." | tee -a "$LOG" >&2

    # if FILE1 is newer than FILE2, copy FILE1 over FILE2
    elif [ "$FILE1" -nt "$FILE2" ]; then
        if cp -p "$FILE1" "$FILE2"; then
            echo "$TIMESTAMP - INFO: '$FILE1' replaced '$FILE2'." | tee -a "$LOG" >&2
            # if the copy is successful, report it in QTS
            /sbin/notice_log_tool -a "$TIMESTAMP - INFO: '$FILE1' replaced '$FILE2'." --severity=5 >&2
        else
            echo "$TIMESTAMP - ERROR: FAILED to replace '$FILE2' with '$FILE1'." | tee -a "$LOG" >&2
            exit 1
        fi

    # if FILE1 is older than FILE2, copy FILE2 over FILE1
    elif [ "$FILE1" -ot "$FILE2" ]; then
        if cp -p "$FILE2" "$FILE1"; then
            echo "$TIMESTAMP - INFO: '$FILE2' replaced '$FILE1'." | tee -a "$LOG" >&2
            # if the copy is successful, report it in QTS
            /sbin/notice_log_tool -a "$TIMESTAMP - INFO: '$FILE2' replaced '$FILE1'." --severity=5 >&2
        else
            echo "$TIMESTAMP - ERROR: FAILED to replace '$FILE1' with '$FILE2'." | tee -a "$LOG" >&2
            exit 1
        fi

    # if the two files differ but have the same modification time
    else
        echo "$TIMESTAMP - ERROR: We should never reach this point. Something is wrong in the script." | tee -a "$LOG" >&2
        exit 1
    fi

# if one file does not exist or is not a regular file, exit
else
    echo "$TIMESTAMP - ERROR: One of the files does not exist, has been moved or renamed." | tee -a "$LOG" >&2
    # on error, report it in QTS
    /sbin/notice_log_tool -a "$TIMESTAMP - ERROR: One of the files does not exist, has been moved or renamed." --severity=5 >&2
    exit 1
fi
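
One way to send each message to both stderr and the log file is a small helper built on `tee -a`. The following is only a sketch: the `log` function and the temporary log path are hypothetical names introduced for illustration and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical log location for this sketch (the script above uses /dir1/ScriptLog.txt)
LOG=$(mktemp)

# log LEVEL MESSAGE...
# Prints one timestamped line on stderr and appends the same line to $LOG.
log() {
    level=$1
    shift
    printf '%s - %s: %s\n' "$(date +"%Y-%m-%d %Hh:%M")" "$level" "$*" |
        tee -a "$LOG" >&2
}

log INFO "'file1' and 'file2' are identical."
log ERROR "FAILED to replace 'file2' with 'file1'."
```

Because `tee -a` copies its input to the file and to its own stdout, redirecting that stdout with `>&2` keeps the message visible on the terminal while the log grows by one line per call.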
giopas
  • You don't even need md5 here. `-gt` compares numbers, as the error says. You need to use `[ "${DATE1}" \> "${DATE2}" ]` or `[[ "${DATE1}" > "${DATE2}" ]]` – Jason Hu Aug 12 '15 at 21:25
  • You never executed `stat()`. You just defined a couple of strings that happen to have the characters `(`, `s`, `t`, etc. in them – Marc B Aug 12 '15 at 21:27
  • HuStmpHrrr, I agree that md5 is not needed locally (I will take it out). Of course, if this process ran between two remote folders, an md5 check (on the local machines) would be required to avoid time mismatches. Is that right? – giopas Aug 12 '15 at 23:21
  • Marc, you are right, thanks – giopas Aug 12 '15 at 23:22
  • For reference (I mentioned this below, but it bears repeating), *this is not a bash script*. You're calling it with `/bin/sh`, and while that may be a symlink or a hardlink to bash on some Linux distributions, when launched this way bash will function in "compatibility mode" and work like a plain old shell. – ghoti Aug 12 '15 at 23:32
  • This script will actually run on a NAS (QNAP), that's why I used /bin/sh – giopas Aug 13 '15 at 00:31
  • Ah, on my QNAP box, /bin/sh is indeed an old version of bash, compiled into busybox. For future reference, it'll be great to mention things like that in the question, as special conditions like that can have a huge effect on how something works. For example, using bash-isms in /bin/sh on any OTHER platform is likely not to work at all. – ghoti Aug 13 '15 at 01:28
  • Do you have access to rsync? I feel like it would be a much simpler solution. – Mr. Llama Aug 13 '15 at 20:16
  • @Mr.Llama yes I have, but how to sync files 2 ways? – giopas Aug 13 '15 at 21:55
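
Marc B's point about `stat` never being executed can be reproduced in a few lines. This is a minimal sketch, not taken from the thread: the temporary file is a hypothetical stand-in for `FILE1`, and `wc -c` replaces `stat -c %Y` only so that the example does not depend on a particular `stat` implementation.

```shell
#!/bin/sh
# Hypothetical stand-in for FILE1, created only for this demonstration
FILE1=$(mktemp)
printf 'abc' > "$FILE1"

# Without the leading "$" this is an ordinary string; no command is run
LITERAL="(wc -c < $FILE1)"

# With "$(...)" the command runs and its output is captured
# (tr removes the padding that some wc implementations print)
SIZE="$(wc -c < "$FILE1" | tr -d ' ')"

echo "LITERAL=$LITERAL"
echo "SIZE=$SIZE"

# Only the captured output is an integer that -gt can compare
if [ "$SIZE" -gt 0 ]; then
    echo "file is not empty"
fi

rm -f "$FILE1"
```

Applied to the question's script, the same fix would be `DATE1="$(stat -c %Y "$FILE1")"`, assuming a `stat` that accepts `-c` (GNU coreutils or BusyBox).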

3 Answers

4

I'm also going to suggest refactoring this, both to simplify the code, and to save your CPU cycles.

#!/bin/sh

# If both files exist....    
if [ -f "$1" -a -f "$2" ]; then

  # If they have the same content...
  if cmp "$1" "$2" >/dev/null 2>/dev/null; then
    echo "INFO: These two files are identical." >&2

  # If one is newer than the other...
  elif [ "$1" -nt "$2" ]; then
    if cp -p "$1" "$2"; then
      echo "INFO: Replaced file '$2' with '$1'." >&2
    else
      echo "ERROR: FAILED to replace file." >&2
      exit 1
    fi

  # If the other is newer than the one...
  elif [ "$1" -ot "$2" ]; then
    if cp -p "$2" "$1"; then
      echo "INFO: Replaced file '$1' with '$2'." >&2
    else
      echo "ERROR: FAILED to replace file." >&2
      exit 1
    fi

  else
    echo "ERROR: we should never reach this point. Something is wrong." >&2
    exit 1

  fi

else

  echo "ERROR: One of these files does not exist." >&2
  exit 1

fi

A few things that you may find useful.

  • This avoids calculating an md5 on each of the files. While comparing sums may be fine for small files like yours, it gets mighty expensive as your files grow. And it's completely unnecessary, because you have the cmp command available. Better to get in the habit of writing code that will work with less modification when you recycle it for the next project.
  • An if statement runs a command, usually [ or [[, but it can be any command. Here, we're running cmp and cp within an if, so that we can easily check the results.
  • This doesn't use stat anymore. While it's possible that you may never look beyond Linux, it's always a good idea to keep portability in mind, and if you can make your script portable, that's great.
  • This is not a bash script. Neither was your script -- if you call your script with /bin/sh, then you're in POSIX compatibility mode, which already makes this more portable than you thought. :-)
  • Indenting helps. You might want to adopt it for your own scripts, so that you can have a better visual idea of what commands are associated with the various conditions that are being tested.
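
The point that `if` can run any command can be tried in isolation. The sketch below uses two hypothetical temporary files and `cmp -s`, which suppresses all output, so no redirection to /dev/null is needed.

```shell
#!/bin/sh
# Hypothetical sample files, created only for this demonstration
A=$(mktemp)
B=$(mktemp)
printf 'same\n' > "$A"
printf 'same\n' > "$B"

# "if" branches on the exit status of whatever command follows it.
# cmp exits 0 when the files are byte-for-byte identical.
if cmp -s "$A" "$B"; then
    RESULT="identical"
else
    RESULT="different"
fi
echo "first check: $RESULT"

# Change one file and test again
printf 'changed\n' > "$B"
if cmp -s "$A" "$B"; then
    RESULT2="identical"
else
    RESULT2="different"
fi
echo "second check: $RESULT2"

rm -f "$A" "$B"
```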
ghoti
  • Thank you so much, ghoti. What do the expressions "-a -f" and "-nt" mean? – giopas Aug 12 '15 at 23:15
  • @giopas, you can [`man test`](https://www.freebsd.org/cgi/man.cgi?query=test) to see the various options for the `[` command. In particular, `-a` means "and", so it joins `-f "$1"` and `-f "$2"`. The `-nt` option means "newer-than", so you can test whether one file is newer or older than another from right within the shell, with no need to call `stat` or any other external tool. – ghoti Aug 12 '15 at 23:30
  • Thank you ghoti, I have adapted your script to make it clearer (to me). One thing: testing the script (the version I made) on two dummy text files, I get the following output: "test1 test2 differ: char 32, line 3 INFO: test2 replaced test1." The script seems to work fine, but I would like to understand how/why I get the first line. Can you help? – giopas Aug 13 '15 at 00:16
  • Ah, I see from your re-write that you want the comparison and copy to go in both directions. I've adapted my script to handle that for you, and added some comments. Re the error, that looks like the output of the `cmp` command. You should already be redirecting stderr, so perhaps on your system, cmp sends that output to stdout. Just redirect its output to /dev/null and you should be good to go. – ghoti Aug 13 '15 at 01:19
  • Yes, I just tested, and in fact the `cmp` command sends its errors to stdout instead of stderr. I've modified my answer to handle this for you, while remaining portable for other platforms. (We don't need any output from `cmp`.) – ghoti Aug 13 '15 at 01:32
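
The `-a`, `-nt` and `-ot` operators discussed in these comments can be tried on files with known timestamps. This is a sketch under stated assumptions: the shell's `[` must support `-nt` and `-ot` (bash, dash and BusyBox ash do), and `newer_of` plus the two temporary files are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical sample files with known modification times
OLD=$(mktemp)
NEW=$(mktemp)
touch -t 202001010000 "$OLD"   # 1 Jan 2020, 00:00
touch -t 202101010000 "$NEW"   # 1 Jan 2021, 00:00

# newer_of A B: print the more recently modified of two regular files
newer_of() {
    # "[ X -a Y ]" is true only when both X and Y are true
    if [ -f "$1" -a -f "$2" ]; then
        if [ "$1" -nt "$2" ]; then
            echo "$1"
        elif [ "$1" -ot "$2" ]; then
            echo "$2"
        else
            echo "same"      # neither is newer, e.g. equal modification times
        fi
    else
        echo "missing"
    fi
}

# The argument order does not matter: the newer file wins either way
WINNER=$(newer_of "$OLD" "$NEW")
REVERSED=$(newer_of "$NEW" "$OLD")
echo "newer file: $WINNER"

rm -f "$OLD" "$NEW"
```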
0

What about something a bit simpler like the following?

#!/bin/sh
# take the two files from the command line
# $1 = current file
# $2 = new file
FILE1=$1
FILE2=$2

# get the modification time of each file (seconds since the epoch)
DATE1=$(stat -c %Y "$FILE1")
DATE2=$(stat -c %Y "$FILE2")

# if the new file is more recent, show the copy command that would run
if [ "$DATE2" -gt "$DATE1" ]; then
    echo "cp -f $FILE2 $FILE1"
#   cp -f "$FILE2" "$FILE1"
fi
doktoroblivion
-1

Almost there. After cleaning up your code and tweaking it a bit, here is what I got:

#!/bin/bash

# select the two files (default option)
FILE1="/dir1/file1.binary"
FILE2="/dir1/file2.binary"

# You can also pass the two file names as arguments for the script
if [ $# -eq 2 ]; then
    FILE1=$1
    FILE2=$2
fi

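# create the hash of the two files
# NOTE: the sed pattern expects BSD-style output ("MD5 (file) = hash");
# GNU md5sum prints "hash  file", for which this yields an empty string
HASH1="$(md5sum "$FILE1" | sed -n -e 's/^.*= //p')"
HASH2="$(md5sum "$FILE2" | sed -n -e 's/^.*= //p')"

# get the dates of last modification
# NOTE: "stat -f" is the BSD/macOS syntax; GNU stat uses "stat -c %Y"
DATE1="$(stat -f '%m%t%Sm' "$FILE1" | cut -c 1-10)"
DATE2="$(stat -f '%m%t%Sm' "$FILE2" | cut -c 1-10)"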

# Uncomment to see the values
#echo $FILE1 ' = hash: ' $HASH1 ' date: ' $DATE1
#echo $FILE2 ' = hash: ' $HASH2 ' date: ' $DATE2

# compare the two hashes
if [ "$HASH1" == "$HASH2" ]; then
    # if the two hashes are the same, exit
    echo "the two files are identical"
    exit 0
fi

# compare the dates
if [ "$DATE1" -gt "$DATE2" ]; then
    # if FILE1 is newer than FILE2, replace FILE2 with FILE1
    cp "$FILE1" "$FILE2"
    echo "${FILE2} was replaced by ${FILE1}"
elif [ "$DATE1" -lt "$DATE2" ]; then
    # else if FILE2 is newer than FILE1, replace FILE1 with FILE2
    cp "$FILE2" "$FILE1"
    echo "${FILE1} was replaced by ${FILE2}"
else
    # else the contents differ but the modification times are equal
    echo "the two files differ but have the same modification time"
fi

Your way of getting the date was wrong, at least on my machine. So I rewrote it.

Your hash string was wrong. You were effectively cropping the string to the first 32 characters. By using sed you can actually get rid of the first part of the command and simply store the result of the md5sum.

You also misused the conditional statements as HuStmpHrrr pointed out.

The rest is cosmetics.

user1472709
  • Thanks. With the hash and name sections taken out, it seems to be the right version of my script :) – giopas Aug 12 '15 at 23:24
  • This won't work. The `[[` is a bash-only test, and this is not a bash script. – ghoti Aug 12 '15 at 23:35
  • ghoti, you are right about the nature of the script. In fact yours is syntactically better and theoretically faster (although the files are so small that the advantage is probably minimal). Nevertheless, a non-expert user might find your version difficult to read and understand. I favored readability over performance. All in all, I would simply prefer your version :) – user1472709 Aug 13 '15 at 00:11
  • Forgot to add that this solution works... maybe only as bash, but works. – user1472709 Aug 13 '15 at 00:21
  • Forgot to add that this solution works. Maybe it works only as bash, but it works as the user expects. Therefore I don't think my solution deserves a down-vote, but fair enough... @ghoti - In your solution you forgot to add the `if cp -p "$1" "$2"; then` block. – user1472709 Aug 13 '15 at 00:28
  • This confusion was indeed caused by the fact that we only learned this was bash on [qnap] late in the game. If you provide an answer with a bash script, it's always best to be explicit, and specify the correct shell in your shebang. (`#!/bin/bash` or better yet `#!/usr/bin/env bash`). The fact that qnap has a non-standard bash is a red herring. I can't remove the downvote unless your answer changes, so how about you add a clarification about the shell, and I'll remove the downvote? – ghoti Aug 13 '15 at 14:11