With your shown samples, please try the following. Written and tested in GNU awk; it should work in any awk.
This creates an array named words whose values can be accessed by index, starting from 1, 2, 3 and so on. Here I am only printing the non-empty values as output, but you can also use the array later in the END block as you wish.
awk -F'=|"' -v s1="\"" '
{
  gsub(/[A-Z]/,"\n&",$3)
  val=(val?val ORS:"")$3
}
END{
  num=split(val,words,ORS)
  for(i=1;i<=num;i++){
    if(words[i]!=""){
      print "WORDS[" ++count "]=" s1 words[i] s1
    }
  }
}
' Input_file
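As a quick check, here is a minimal sketch of a run. The sample lines are hypothetical (they assume input of the form key="CamelCaseValue", so that the third field is the text between the quotes), and the wrapper function name split_words is only for illustration.

```shell
# Hypothetical wrapper around the awk program above; reads the files
# given as arguments, or standard input when no file is given.
split_words() {
  awk -F'=|"' -v s1="\"" '
  {
    gsub(/[A-Z]/,"\n&",$3)        # newline before every capital letter in field 3
    val=(val?val ORS:"")$3        # collect all modified fields in val
  }
  END{
    num=split(val,words,ORS)      # one array element per newline-separated piece
    for(i=1;i<=num;i++){
      if(words[i]!=""){           # skip the empty pieces
        print "WORDS[" ++count "]=" s1 words[i] s1
      }
    }
  }
  ' "$@"
}

# Assumed sample input, two lines.
printf '%s\n' 'name="HelloWorld"' 'type="FooBarBaz"' | split_words
# Prints:
# WORDS[1]="Hello"
# WORDS[2]="World"
# WORDS[3]="Foo"
# WORDS[4]="Bar"
# WORDS[5]="Baz"
```

Note that the printed index comes from count, not from i, because the words array also contains empty elements that are skipped.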
Explanation: Adding a detailed explanation for the above awk code.
awk -F'=|"' -v s1="\"" '      ##Start the awk program, set the field separator to = OR " and set s1 to a double quote.
{
  gsub(/[A-Z]/,"\n&",$3)      ##In the 3rd field, globally replace each capital letter with a newline followed by the letter itself (& is the matched text).
  val=(val?val ORS:"") $3     ##Append $3 to val, separated from the earlier content by ORS.
}
END{                          ##Start the END block of this program.
  num=split(val,words,ORS)    ##Split val into the array words with ORS as the delimiter; num holds the number of elements.
  for(i=1;i<=num;i++){        ##Run a for loop from 1 to num.
    if(words[i]!=""){         ##If the current element of words is NOT empty, do the following.
      print "WORDS[" ++count "]=" s1 words[i] s1  ##Print WORDS[ followed by the incremented count, then ]= and the words[i] value wrapped in s1.
    }
  }
}
' Input_file                  ##Mention the Input_file name here.
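The key step is the & in the gsub replacement string, which stands for whatever the regex matched. A minimal sketch of that step in isolation follows; the function name mark_capitals and the sample word are made up for illustration.

```shell
# Hypothetical helper: put a newline before every capital letter.
# gsub returns the number of substitutions it made, shown here as n.
mark_capitals() {
  awk '{ n = gsub(/[A-Z]/, "\n&"); print n " substitutions:" $0 }'
}

printf 'HelloWorld\n' | mark_capitals
# Prints:
# 2 substitutions:
# Hello
# World
```

Because the first capital letter also gets a newline in front of it, the split in the END block produces an empty first element, which is why the main program checks words[i]!="" before printing.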