I wrote a script that searches a huge CSV file for the fields I want to print, stores them in an array, and then passes that array to awk to print them, thanks to an answer found here:
awk -F, \
    -v _sourcefile="$i" \
    -v title="\"${k}\"" \
    -v box="_${j}_" \
    -v score="$dock_score_column" \
    -v xp_terms_columns="${xp_terms_columns[*]}" \
'
BEGIN {
    nxp  = split(xp_terms_columns, xp, " ")
    nfmt = split("%-8s %s %9s %9s %8s %10s %7s %10s %16s %14s %9s %11s %9s %9s %12s %7s %6s %7s", fmt, " ")
}
($title_column ~ title) && ($source_column ~ _sourcefile) && ($source_column ~ box) {
    if ( $xp[6] ~ / / ) {    # <-- this is the problem
        printf "%-8s", $score
        exit 0
    }
    else {
        printf "%-8s =", $score
        for ( i=1; i<=nxp; i++ ) {
            printf ("%s" fmt[i]), OFS, $(xp[i])
        }
        print ""
    }
}
' "$file"
The if statement is supposed to keep empty or blank fields from being printed: when one is found, the script should print only $score and skip the else branch. To make this easier to follow, here is some test output:
#Desired output
XP HBond + Electro + PhobEn + PhobEnHB + LowMW + RotPenal + LipophilicEvdW + PhobEnPairHB + Sitemap + Penalties + PiStack + HBPenal + ExposPenal + PiCat + ClBr + Zpotr
-9,473 = -0,953 -1,133 -2,700 0,000 0,000 0,211 -3,298 0,000 -1,600 0,000 0,000 0,000 0,000 0,000 0,000 0,000
#Desired output if one or more indexes are blank
XP HBond + Electro + PhobEn + PhobEnHB + LowMW + RotPenal + LipophilicEvdW + PhobEnPairHB + Sitemap + Penalties + PiStack + HBPenal + ExposPenal + PiCat + ClBr + Zpotr
-9,473
The else branch should print the full line only IF all the elements in xp exist. Unfortunately, if the CSV has the fields but their values in a record are empty (i.e. ""), the script prints something like:
XP HBond + Electro + PhobEn + PhobEnHB + LowMW + RotPenal + LipophilicEvdW + PhobEnPairHB + Sitemap + Penalties + PiStack + HBPenal + ExposPenal + PiCat + ClBr + Zpotr
-7,897 = 0,000
Instead, I want it to print only the first number, stored in $score. How can I check whether ALL of the array's indexes/values exist and are not blank or empty?
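One possible way to express that check is to loop over every requested column and test each field before printing anything. Note that the pattern / / only matches a field containing a literal space, so a truly empty field slips past it. The sketch below is only an illustration under assumptions: the sample data, the `cols` variable (a stand-in for xp_terms_columns) and the rule that an empty, whitespace-only, or quoted-empty field counts as blank are all hypothetical, not taken from the real script.

```shell
# Hypothetical sample data: a header row, one complete record, one record with blanks.
csv='Title,Score,Score-1,Score-2,Score-3
foo,4.9,1.2,0.3,0.7
bar,5.1,1.0,,'

# cols holds the field numbers to print (a stand-in for xp_terms_columns).
result=$(printf '%s\n' "$csv" | awk -F, -v score=2 -v cols="3 4 5" '
BEGIN { n = split(cols, xp, " ") }
NR > 1 {
    blank = 0
    for (i = 1; i <= n; i++)
        # Assumption: empty, whitespace-only, or a quoted empty string counts as blank.
        if ($(xp[i]) ~ /^"?[ \t]*"?$/) { blank = 1; break }
    # At least one requested field is blank: print the score alone.
    if (blank) { print $score; next }
    # All requested fields are filled: print the score followed by each term.
    line = $score " ="
    for (i = 1; i <= n; i++) line = line " " $(xp[i])
    print line
}')
printf '%s\n' "$result"
```

Using `next` rather than `exit` keeps awk reading the remaining records. If only the first matching record matters, `exit` would do instead.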
EDIT
To make my problem easier to explain, I've made a (fake) $file that can be used to test possible solutions:
Title,Score,Score-1,Score-2,Score-3,Score-4
foo,4.9,1.2,,,,
This kind of file gives me trouble: although no value is specified in fields 4, 5, or 6 of record 2, awk still prints out an (empty) line and stores it in the array xp. By modifying the printf format, I found that the array element is empty but still populated. Checking nxp also shows 5 different elements, although only 2 numbers appear in the example file.
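The behaviour described above can be reproduced on the sample record alone. With -F, awk creates a field for every position between commas, whether or not anything is there. Likewise, split() on a list of column numbers returns how many numbers were passed, regardless of what a given record contains. The snippet below is a small sketch for inspection only, and the `record` and `fields` variable names are made up for it.

```shell
# The second record from the sample file, including its trailing commas.
record='foo,4.9,1.2,,,,'

# Print the field count, then each field in brackets so empty ones are visible.
fields=$(printf '%s\n' "$record" | awk -F, '{
    out = "NF=" NF
    for (i = 1; i <= NF; i++) out = out " [" $i "]"
    print out
}')
printf '%s\n' "$fields"
# prints: NF=7 [foo] [4.9] [1.2] [] [] [] []
```

So the empty fields exist as far as awk is concerned, and the only way to tell them apart is to test their contents.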
EDIT-2
I wish my real file were this short and simple. I've written an entire separate script whose sole purpose is to filter it and find only the fields I need, in a specific order, so there is no way to simply print from field a to field b without problems.