2

I have isolated the problem to the below code snippet:

  1. Notice below that null string gets assigned to LATEST_FILE_NAME='' when the script is run using ksh; but the script assigns the value to variable $LATEST_FILE_NAME correctly when run using sh. This in turn affects the value of $FILE_LIST_COUNT.
  2. But as the script is in KornShell (ksh), I am not sure what might be causing the issue.
  3. When I comment out the tee command in the below line, the ksh script works fine and correctly assigns the value to variable $LATEST_FILE_NAME.
(cd $SOURCE_FILE_PATH; ls *.txt 2>/dev/null) | sort -r > ${SOURCE_FILE_PATH}/${FILE_LIST} | tee -a $LOG_FILE_PATH

Kindly consider:

1. Source Code: script.sh

#!/usr/bin/ksh
set -vx # Enable debugging

SCRIPTLOGSDIR=/some/path/Scripts/TEST/shell_issue
SOURCE_FILE_PATH=/some/path/Scripts/TEST/shell_issue
# Log file
Timestamp=`date +%Y%m%d%H%M`
LOG_FILENAME="TEST_LOGS_${Timestamp}.log"
LOG_FILE_PATH="${SCRIPTLOGSDIR}/${LOG_FILENAME}"
## Temporary files
FILE_LIST=FILE_LIST.temp    #Will store all  extract filenames
FILE_LIST_COUNT=0           # Stores total number of  files

getFileListDetails(){
    rm -f $SOURCE_FILE_PATH/$FILE_LIST 2>&1 | tee -a $LOG_FILE_PATH

    # Get list of all files, Sort in reverse order, and store names of the  files line-wise. If no files are found, error is muted.
    (cd $SOURCE_FILE_PATH; ls *.txt 2>/dev/null) | sort -r > ${SOURCE_FILE_PATH}/${FILE_LIST} | tee -a $LOG_FILE_PATH

    if [[ ! -f $SOURCE_FILE_PATH/$FILE_LIST ]]; then
        echo "FATAL ERROR - Could not create a temp file for  file list.";exit 1;
    fi

    LATEST_FILE_NAME="$(cd $SOURCE_FILE_PATH; head -1 $FILE_LIST)";
    FILE_LIST_COUNT="$(cat $SOURCE_FILE_PATH/$FILE_LIST | wc -l)";

}

getFileListDetails;
exit 0;

2. Output when using shell sh script.sh:

+ getFileListDetails
+ rm -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300506.log
+ cd /some/path/Scripts/TEST/shell_issue
+ sort -r
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300506.log
+ ls 1.txt 2.txt 3.txt
+ [[ ! -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp ]]
cd $SOURCE_FILE_PATH; head -1 $FILE_LIST
++ cd /some/path/Scripts/TEST/shell_issue
++ head -1 FILE_LIST.temp
+ LATEST_FILE_NAME=3.txt
cat $SOURCE_FILE_PATH/$FILE_LIST | wc -l
++ cat /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
++ wc -l
+ FILE_LIST_COUNT=3
exit 0;
+ exit 0

3. Output when using ksh ksh script.sh:

+ getFileListDetails
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300507.log
+ rm -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ 2>& 1
+ tee -a /some/path/Scripts/TEST/shell_issue/TEST_LOGS_201304300507.log
+ sort -r
+ 1> /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ cd /some/path/Scripts/TEST/shell_issue
+ ls 1.txt 2.txt 3.txt
+ 2> /dev/null
+ [[ ! -f /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp ]]
+ cd /some/path/Scripts/TEST/shell_issue
+ head -1 FILE_LIST.temp
+ LATEST_FILE_NAME=''
+ wc -l
+ cat /some/path/Scripts/TEST/shell_issue/FILE_LIST.temp
+ FILE_LIST_COUNT=0
exit 0;+ exit 0
javaPlease42
  • 4,699
  • 7
  • 36
  • 65
Kent Pawar
  • 2,378
  • 2
  • 29
  • 42
  • 1
    zero output almost certainly means either an empty file OR an empty variable, being evaluated as an empty file. Triple-check your spelling of variables? More helpful will be to turn on shell debugging with `set -vx`. You will see each line before it is executed, and then the line as it is executed WITH the variables expanded to their values. Look there to make sure everything is working as you expect. Good luck. – shellter Apr 18 '13 at 01:12
  • Thanks @shellter! I will try using the `set -vx` to debug it – Kent Pawar Apr 18 '13 at 02:41
  • How did it go with the debugging? – Adrian Frühwirth Apr 29 '13 at 19:20
  • Hi @AdrianFrühwirth , shellter - I debugged the code, ran it using different shell interpreters and finally isolated the problem and have reproduced it above.. :) Thanks – Kent Pawar Apr 30 '13 at 09:46
  • 1
    Before going into detail about some things that are wrong with your snippet, what is your higher goal and why do you think you need to have a reversed list of files? What do you want to do with that? – Adrian Frühwirth Apr 30 '13 at 11:24
  • The requirement is that a file will be SFTP-ed to the target server on a daily basis.. The filename will be a timestamp, of the time the file was generated on the source server. The above script needs to pick up the latest file based on the file name (so we sort in reverse order) and then process the latest file and rest of the files accordingly. So I need the files names to be stored in a file in reverse timestamp order. The problem area is just this statement: ` (cd $SOURCE_FILE_PATH; ls *.txt 2>/dev/null) | sort -r > ${SOURCE_FILE_PATH}/${FILE_LIST} | tee -a $LOG_FILE_PATH` – Kent Pawar Apr 30 '13 at 13:29
  • ...There may be better ways to handle the above requirement (which I would be happy to hear!), but the problem area is just this statement: `(cd $SOURCE_FILE_PATH; ls *.txt 2>/dev/null) | sort -r > ${SOURCE_FILE_PATH}/${FILE_LIST} | tee -a $LOG_FILE_PATH` .. that behaves differently in sh and ksh. The subtle differences in the way the syntax is parsed *might* be the root cause and knowing this would help me as a beginner.. – Kent Pawar Apr 30 '13 at 13:42

1 Answers1

3

OK, here goes...this is a tricky and subtle one. The answer lies in how pipelines are implemented. POSIX states that

If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.)

Notice the keyword may. Many shells implement this in a way that all commands need to complete, e.g. see the manpage:

The shell waits for all commands in the pipeline to terminate before returning a value.

Notice the wording in the manpage:

Each command, except possibly the last, is run as a separate process; the shell waits for the last command to terminate.

In your example, the last command is the tee command. Since there is no input to tee because you redirect stdout to ${SOURCE_FILE_PATH}/${FILE_LIST} in the command before, it immediately exits. Oversimplified speaking, the tee is faster than the earlier redirection, which means that your file is probably not finished writing to by the time you are reading from it. You can test this (this is not a fix!) by adding a sleep at the end of the whole command:

$ ksh -c 'ls /tmp/* | sort -r > /tmp/foo.txt | tee /tmp/bar.txt; echo "[$(head -n 1 /tmp/foo.txt)]"'
[]

$ ksh -c 'ls /tmp/* | sort -r > /tmp/foo.txt | tee /tmp/bar.txt; sleep 0.1; echo "[$(head -n 1 /tmp/foo.txt)]"'
[/tmp/sess_vo93c7h7jp2a49tvmo7lbn6r63]

$ bash -c 'ls /tmp/* | sort -r > /tmp/foo.txt | tee /tmp/bar.txt; echo "[$(head -n 1 /tmp/foo.txt)]"'
[/tmp/sess_vo93c7h7jp2a49tvmo7lbn6r63]

That being said, here are a few other things to consider:

  1. Always quote your variables, especially when dealing with files, to avoid problems with globbing, word splitting (if your path contains spaces) etc.:

    do_something "${this_is_my_file}"

  2. head -1 is deprecated, use head -n 1

  3. If you only have one command on a line, the ending semicolon ; is superfluous...just skip it

  4. LATEST_FILE_NAME="$(cd $SOURCE_FILE_PATH; head -1 $FILE_LIST)"

    No need to cd into the directory first, just specify the whole path as argument to head:

    LATEST_FILE_NAME="$(head -n 1 "${SOURCE_FILE_PATH}/${FILE_LIST}")"

  5. FILE_LIST_COUNT="$(cat $SOURCE_FILE_PATH/$FILE_LIST | wc -l)"

    This is called Useless Use Of Cat because the cat is not needed - wc can deal with files. You probably used it because the output of wc -l myfile includes the filename, but you can use e.g. FILE_LIST_COUNT="$(wc -l < "${SOURCE_FILE_PATH}/${FILE_LIST}")" instead.

Furthermore, you will want to read Why you shouldn't parse the output of ls(1) and How can I get the newest (or oldest) file from a directory?.

Adrian Frühwirth
  • 42,970
  • 10
  • 60
  • 71