
I'm trying to write an if-else condition in a shell/bash script that will be used for many different files, so the input won't always have the same structure.

I have three different files, and up to three variables are read from each of them to feed the if-else statement. At the beginning of my script I have the following (which could probably be written better):

ANC1=$(sed -n 1p file1 | cut -f 1 -d' ' )
ANC2=$(sed -n 2p file1 | cut -f 1 -d' ' )
ANC3=$(sed -n 3p file1 | cut -f 1 -d' ' )

ANC11=$(sed -n 1p file2 | cut -f 1 -d' ' )
ANC21=$(sed -n 2p file2 | cut -f 1 -d' ' )
ANC31=$(sed -n 3p file2 | cut -f 1 -d' ' )

ANC15=$(sed -n 1p file3 | cut -f 1 -d' ' )
ANC25=$(sed -n 2p file3 | cut -f 1 -d' ' )
ANC35=$(sed -n 3p file3 | cut -f 1 -d' ' )

As an example, these files could produce the following variables:

echo ${ANC1}
FIN
echo ${ANC2}
NFE
echo ${ANC3}


echo ${ANC11}
FIN
echo ${ANC21}
NFE
echo ${ANC31}


echo ${ANC15}
FIN
echo ${ANC25}
NFE
echo ${ANC35}
SAS 

From here, I've written the if-else statement, taking into account that some variables may be missing in the three files, as above. The intent is as follows:

first condition: if all variables are not empty; second condition: if the third variable is the only missing variable; third condition: if the third and second variables are empty

if [ "${ANC3}" != "" ] || [ "${ANC31}" != "" ] || [ "${ANC35}" != "" ]; then

    echo "***** three variables *****"

    bcftools merge -m both \
    fileref1.genotypes_${ANC1}.vcf.gz \
    fileref1.genotypes_${ANC2}.vcf.gz \
    fileref1.genotypes_${ANC3}.vcf.gz \
    -Oz \
    -o fileref1.new.genotypes_${ANC1}.${ANC2}.${ANC3}.vcf.gz

    bcftools merge -m both \
    fileref2.genotypes_${ANC11}.vcf.gz \
    fileref2.genotypes_${ANC21}.vcf.gz \
    fileref2.genotypes_${ANC31}.vcf.gz \
    -Oz \
    -o fileref2.new.genotypes_${ANC11}.${ANC21}.${ANC31}.vcf.gz

    bcftools merge -m both \
    fileref3.genotypes_${ANC15}.vcf.gz \
    fileref3.genotypes_${ANC25}.vcf.gz \
    fileref3.genotypes_${ANC35}.vcf.gz \
    -Oz \
    -o fileref1.new.genotypes_${ANC15}.${ANC25}.${ANC35}.vcf.gz

elif 
    [ "${ANC3}" == "" -a "${ANC2}" != "" ] || [ "${ANC31}" == "" -a "${ANC21}" != "" ] || [ "${ANC35}" == "" -a "${ANC25}" != "" ]; then

    echo "***** two variables *****"

    bcftools merge -m both \
    fileref1.genotypes_${ANC1}.vcf.gz \
    fileref1.genotypes_${ANC2}.vcf.gz \
    -Oz \
    -o fileref1.new.genotypes_${ANC1}.${ANC2}.vcf.gz

    bcftools merge -m both \
    fileref2.genotypes_${ANC11}.vcf.gz \
    fileref2.genotypes_${ANC21}.vcf.gz \
    -Oz \
    -o fileref2.new.genotypes_${ANC11}.${ANC21}.vcf.gz

    bcftools merge -m both \
    fileref3.genotypes_${ANC15}.vcf.gz \
    fileref3.genotypes_${ANC25}.vcf.gz \
    -Oz \
    -o fileref1.new.genotypes_${ANC15}.${ANC25}.vcf.gz

elif 
    [ "${ANC3}" == "" -a "${ANC2}" == "" ] || [ "${ANC31}" == "" -a "${ANC21}" == "" ] || [ "${ANC35}" == "" -a "${ANC25}" == "" ]; then 

    echo "***** one variable ***** "

    cp fileref1.genotypes_${ANC1}.vcf.gz fileref1.new.genotypes_${ANC1}.${ANC2}.vcf.gz

    cp fileref2.genotypes_${ANC11}.vcf.gz fileref2.new.genotypes_${ANC11}.vcf.gz

    cp fileref3.genotypes_${ANC15}.vcf.gz fileref1.new.genotypes_${ANC15}.vcf.gz

fi

Every time I run this script, 3 files are supposed to be produced, but sometimes this is not the case. The first branch works (for the files where all variables are non-empty), but the second and third conditions don't seem to. I've also tried [ -z "${ANC3}" ] and [ -n "${ANC2}" ] to test for missing and non-missing values, respectively, but this did not work either. I also tried [[ ]] instead of [ ], with the same result.

Anything that I'm obviously missing?

zx8754
joeblow
  • Are the variables truly empty, or do they contain whitespace characters? – Kusalananda May 12 '18 at 19:10
  • We almost certainly have a duplicate of your question, but it's so hidden inside of code that has nothing to do with `if` statements and empty variables that it's almost impossible to tell. Please try to follow the rules regarding building a [mcve] -- the *shortest possible code* someone else can run to see the problem themselves, with actual and intended output clearly distinguished -- for questions that involve code. – Charles Duffy May 12 '18 at 19:50
  • BTW, there are much, **much** more efficient ways to read fields from your first three lines into variables than a bunch of `sed` calls. `{ read anc1 _; read anc2 _; read anc3 _; }` – Charles Duffy May 12 '18 at 19:51
  • You don't need `sed` here. For example, `{ read ANC1 _; read ANC2 _; read ANC3 _; } < file1`. – chepner May 12 '18 at 19:52
  • BTW, note also that `==` as a test operator is a nonstandard extension (the standard string comparison operator is `=`), and using `-a` or `-o` to combine multiple tests is marked obsolescent (see the `[OB XSI]` markers in http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html). And all-caps names are used by variables meaningful to the shell, whereas names with at least one lowercase character are guaranteed not to impact shell operation; see http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html, fourth paragraph. – Charles Duffy May 12 '18 at 19:54
  • ...so, if your question just had two lines of code, one that set up your variables to a specific value where your `if` doesn't work the way you expect, and a second line with that `if`, and asked why the `if` branches one way instead of the other... that would be *vastly* easier to follow (and anyone could run it themselves, even on an online interpreter, without needing to create `file1` and `file2` and so forth). Also, putting the assignments in the question with explicit literals excludes surprises like file fields that *look* empty but have nonprintable characters. – Charles Duffy May 12 '18 at 19:58
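
The suggestions in the comments above (read the fields with `read`, compare with standard operators, and join tests with `&&` rather than `-a`) can be put together in a minimal sketch. The temporary file and its contents are made up for illustration and stand in for `file1`; the lowercase names `anc1`, `anc2`, `anc3` and `result` are likewise assumptions, not names from the original script.

```shell
#!/bin/sh
# Hypothetical sample input standing in for file1: only two lines exist.
tmpfile=$(mktemp)
printf 'FIN 0.5\nNFE 0.3\n' > "$tmpfile"

# Start from known-empty values. A read that hits end-of-file returns
# non-zero and leaves its variable empty; "|| true" ignores that status.
anc1='' anc2='' anc3=''
{ read -r anc1 _; read -r anc2 _; read -r anc3 _; } < "$tmpfile" || true
rm -f "$tmpfile"

# -n / -z are the standard tests for non-empty / empty strings;
# separate [ ] commands joined with && replace the obsolescent -a.
if [ -n "$anc1" ] && [ -n "$anc2" ] && [ -z "$anc3" ]; then
    result="two values: $anc1 $anc2"
else
    result="other"
fi
echo "$result"
```

Unlike `cut -f 1 -d' '`, `read` splits on any run of whitespace and drops leading and trailing blanks, so a field that only looks empty because it holds spaces ends up as a truly empty string.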

2 Answers


I'm not sure I understand how the logic is supposed to work, but I think you might be having trouble with De Morgan's laws, which describe how logical negation combines with ANDs and ORs. English tends to be pretty sloppy about this, so you have to think carefully when you translate what you want into code logic. Specifically, you said "first condition: if all variables are not empty", but the corresponding if statement:

if [ "${ANC3}" != "" ] || [ "${ANC31}" != "" ] || [ "${ANC35}" != "" ]; then

...actually corresponds to "if ANY of the variables is not empty".

In the example you gave, ANC3 and ANC31 are both empty (so the first two tests come back as false), and ANC35 is not empty (it's "SAS"), so the third test is true. false || false || true evaluates to true, so that if condition as a whole is true, and that branch of the if statement will be executed. Is that what's supposed to happen with only one of the variables being nonempty?

If I'm right about the problem, then the first if statement should have &&s instead of ||s, like this:

if [ "${ANC3}" != "" ] && [ "${ANC31}" != "" ] && [ "${ANC35}" != "" ]; then

There may also be similar problems with the elif tests, but as I said I'm not sure I understand what it's supposed to do properly.
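
To see the difference concretely, here is a small sketch using the values from the question's example. The helper variables `any` and `all` are introduced only for this illustration.

```shell
#!/bin/sh
# Values taken from the example in the question.
ANC3='' ANC31='' ANC35='SAS'

# "ANY of the variables is non-empty": one true test is enough.
if [ -n "$ANC3" ] || [ -n "$ANC31" ] || [ -n "$ANC35" ]; then
    any=yes
else
    any=no
fi

# "ALL of the variables are non-empty": every test must be true.
if [ -n "$ANC3" ] && [ -n "$ANC31" ] && [ -n "$ANC35" ]; then
    all=yes
else
    all=no
fi

echo "any=$any all=$all"
```

With these values the `||` version takes the "three variables" branch even though two of the three files have no third value, which is the situation described in the question.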

Gordon Davisson
  • Technically you are right, and that is what I had first time round. However, the reason I changed it from `&&` to `||` is that it applies to all three files (`file1`, `file2`, `file3`) as `and/or`. That is, for the first condition, for example, it's not `if all three variables are not empty in all three files then do stuff`. It's `if all three variables are not empty in any of the 3 files, then do stuff in the files where the three variables are not empty`. That is the logic in the following 2 conditions as well. Does that make sense? Apologies for making this convoluted. – joeblow May 13 '18 at 15:24
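
One way to get the per-file behaviour described in the comment above is to decide separately for each file instead of combining all nine variables in a single if-else. The following is only a sketch under that assumption: `plan_merge` is a hypothetical helper, it prints the command it would run rather than executing it, and the file-name pattern is copied from the question.

```shell
#!/bin/sh
# plan_merge PREFIX A1 A2 A3
# Prints the command for ONE file, based on how many labels are non-empty.
# Hypothetical helper: it only echoes, so nothing is merged or copied here.
# Note: plain sh has no local variables, so these names are global.
plan_merge() {
    prefix=$1 a1=$2 a2=$3 a3=$4
    src="$prefix.genotypes" dst="$prefix.new.genotypes"
    if [ -n "$a1" ] && [ -n "$a2" ] && [ -n "$a3" ]; then
        echo "bcftools merge -m both ${src}_$a1.vcf.gz ${src}_$a2.vcf.gz ${src}_$a3.vcf.gz -Oz -o ${dst}_$a1.$a2.$a3.vcf.gz"
    elif [ -n "$a1" ] && [ -n "$a2" ]; then
        echo "bcftools merge -m both ${src}_$a1.vcf.gz ${src}_$a2.vcf.gz -Oz -o ${dst}_$a1.$a2.vcf.gz"
    elif [ -n "$a1" ]; then
        echo "cp ${src}_$a1.vcf.gz ${dst}_$a1.vcf.gz"
    else
        echo "skip $prefix: no labels found" >&2
    fi
}

# Values from the question's example: only the third file has three labels.
plan1=$(plan_merge fileref1 FIN NFE '')
plan2=$(plan_merge fileref2 FIN NFE '')
plan3=$(plan_merge fileref3 FIN NFE SAS)
printf '%s\n' "$plan1" "$plan2" "$plan3"
```

Because each call looks at one file's labels only, every file gets exactly one output, whichever branch applies to it. To run the commands for real, the `echo` lines would be replaced by the commands themselves.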

This is not a complete answer, but here are simple POSIX shell examples, given variables $x, $y, and $z:

first condition: if all variables are not empty;

[ "$x" -a "$y" -a "$z" ] && do_stuff

second condition: if the third variable is the only missing variable;

[ "$x" -a "$y" -a ! "$z" ] && do_stuff

third condition: if the third and second variables are empty

[ "$y$z" ] || do_stuff
agc