4

I have the following output.txt, which consists of only 2 columns for demonstration:

Test1 Test1-IS-OK
Test2 Test2-IS-NOT
Test3 Test3-IS-OK
Test4 Test4-IS-OK
Test5 Test5-IS-NOT

Then my bash script has the following code:

#!/bin/bash
output="output.txt"
a=$(awk '{ print $1 }' $output)
b=$(awk '{ print $2 }' $output)

while IFS=" " read -r $a $b
do
    echo "LOG: $a and $b"
done < "$output"

I got the following error:

./test.sh: line 13: read: `Test1-IS-OK': not a valid identifier

I need to have output like this

LOG: Test1 and Test1-IS-OK
LOG: Test2 and Test2-IS-NOT
LOG: Test3 and Test3-IS-OK
LOG: Test4 and Test4-IS-OK
LOG: Test5 and Test5-IS-NOT

But the code is not working. What is the best method to loop over these 2 columns from a file? Is there a simpler method?

RavinderSingh13
Kalib Zen
  • 2
    FWIW I upvoted as you had input, output, code and a problem statement but the text associated with the bash tag (hover your mouse over it) specifically says `For shell scripts with errors/syntax errors, please check them with the shellcheck program (or in the web shellcheck server at https://shellcheck.net) before posting here.` and if you'd done that then shellcheck would've answered your question instead of you having to post it here so that may be causing you to get some down votes. – Ed Morton Oct 14 '20 at 00:19

4 Answers

7

It is best to avoid bash and do this completely in awk. Within awk it is as simple as:

awk '{print "LOG:", $1, "and", $2}' file
LOG: Test1 and Test1-IS-OK
LOG: Test2 and Test2-IS-NOT
LOG: Test3 and Test3-IS-OK
LOG: Test4 and Test4-IS-OK
LOG: Test5 and Test5-IS-NOT
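
If each row needs some extra per-line logic, that can often stay inside awk as well. The following is only a sketch: the function name `log_with_status` and the "passed"/"failed" labels are made up for illustration and are not part of the question.

```bash
#!/bin/bash
# Hypothetical helper: formats each row and derives a status from column 2.
# It reads the files given as arguments, or standard input if there are none.
log_with_status() {
    awk '{
        # The suffix test and the labels are illustrative assumptions.
        status = ($2 ~ /-IS-OK$/) ? "passed" : "failed"
        printf "LOG: %s and %s (%s)\n", $1, $2, status
    }' "$@"
}

# Demo on two rows of the sample data.
printf '%s\n' 'Test1 Test1-IS-OK' 'Test2 Test2-IS-NOT' | log_with_status
```

The whole file is still processed by a single awk process, which is the point of keeping the logic out of a shell loop.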
anubhava
  • Hey thanks for the tip on awk, but I prefer to use separate variable so I can pass that variable to other statement. upvoted your answer – Kalib Zen Oct 13 '20 at 19:22
  • 1
    `I prefer to use separate variable so I can pass that variable to other statement` Maybe if you clarify this, I can show how you can do that in awk itself. – anubhava Oct 13 '20 at 19:25
  • Sorry, I mean I need to use the variables `$a` and `$b`, and they need to be declared inside that loop because other statements inside the loop will be using them. Thank you so much for your effort to help me on this. Actually the answer given by @Hilton Fernandes is what I meant. – Kalib Zen Oct 13 '20 at 19:31
  • 4
    That answer is actually hugely error prone and a bad practice. [Read this carefully](https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice) – anubhava Oct 13 '20 at 19:34
  • 1
    Indeed, @anubhava! awk is powerful, and gawk is even more powerful. It is a pity that it is not used more widely. – Hilton Fernandes Oct 14 '20 at 03:04
  • But sometimes you have no choice and need `awk` inside the loop, because you want to synchronize the result from awk with every change. For example, using inotify: `while inotifywait -q -e modify $some.txt > /dev/null; do your_awk_statement_on_some.txt; done`. You cannot put `your_awk_statement_on_some.txt` outside. – MaXi32 Oct 14 '20 at 13:51
3

Please consider transferring the awk parsing to the loop, where it belongs:

#!/bin/bash

output="output.txt"

while read -r line 
do
    a=$(echo "${line}" | awk '{print $1}')
    b=$(echo "${line}" | awk '{print $2}')
    echo "LOG: $a and $b"
done < "$output"

Edited according to a good suggestion by @EdMorton

Hilton Fernandes
  • 3
    `awk` is a program that will loop through a file, so it belongs outside the loop. – Walter A Oct 13 '20 at 19:21
  • 2
    Indeed, but in the first version of the script, it would not generate the fields of each line as desired, but two lists of fields that require a lot more synchronization to parse in parallel. – Hilton Fernandes Oct 13 '20 at 19:34
  • 1
    I added a solution for a while-loop without calling `awk` inside the loop. – Walter A Oct 13 '20 at 19:40
  • If you copy/paste that script into http://shellcheck.net it'll tell you about some, but not all, of the issues with it. – Ed Morton Oct 13 '20 at 23:43
2

What are the problems with your code?

a=$(awk '{ print $1 }' $output)

With echo "a=${a}" you will see that a is filled with the output for all lines. You were trying to make some kind of function, to be called later as $a.

while IFS=" " read -r $a $b

Now you are trying to call the "functions" a and b. The shell substitutes the values of the variables before reading the input file. When a is filled with "Test1 Test2", the code will try to fill the variables Test1 and Test2.
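
To see this expansion for yourself, a small sketch like the following may help. It rebuilds two rows of the sample file in a temporary file (the temporary path and the array name `names` are only for illustration) and prints the words that `read` would be handed as variable names.

```bash
#!/bin/bash
# Recreate two rows of the sample input in a temporary file.
output=$(mktemp)
printf '%s\n' 'Test1 Test1-IS-OK' 'Test2 Test2-IS-NOT' > "$output"

a=$(awk '{ print $1 }' "$output")   # whole first column, one value per line
b=$(awk '{ print $2 }' "$output")   # whole second column, one value per line
rm -f "$output"

# Unquoted on purpose: word splitting yields the same words that
# `read -r $a $b` would receive as variable names.
names=($a $b)
echo "read would be given ${#names[@]} names:"
printf '  %s\n' "${names[@]}"
```

The first word containing a `-` is `Test1-IS-OK`, which is not a valid variable name, and that is the identifier the error message complains about.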

When you only want to change the output, without passing the variables to another statement, you can use awk, or

sed -E 's/([^ ]*) ([^ ]*).*/LOG: \1 and \2/' "$output"
# or (relies on word splitting, so it assumes exactly two fields per line)
printf 'LOG: %s and %s\n' $(<"$output")

In your case, you can have read fill two variables directly:

while read -r a b 
do
    echo "LOG: $a and $b"
done < "$output"
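
Since the variables are meant to be used by other statements inside the loop, here is a slightly extended sketch of the same idea. The function name `process_columns`, the third variable `rest`, and the OK counter are assumptions added for illustration; they are not part of the question.

```bash
#!/bin/bash
# Hypothetical helper: prints one LOG line per row and counts the OK rows.
process_columns() {
    local file=$1 a b rest ok=0
    # `rest` absorbs any extra columns so they do not end up in b.
    # `|| [ -n "$a" ]` still processes a last line without a trailing newline.
    while read -r a b rest || [ -n "$a" ]; do
        echo "LOG: $a and $b"
        # a and b are ordinary shell variables here, usable by any statement.
        case $b in
            *-IS-OK) ok=$((ok + 1)) ;;
        esac
    done < "$file"
    echo "OK count: $ok"
}

# Demo with sample rows from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'Test1 Test1-IS-OK' 'Test2 Test2-IS-NOT' 'Test3 Test3-IS-OK' > "$sample"
process_columns "$sample"
rm -f "$sample"
```

Because the loop reads from a redirection rather than a pipe, it runs in the current shell, so the counter keeps its value after the loop ends.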
Walter A
  • Thanks, I accepted this answer because it explains the mistake I made in the code, and because awk doesn't need to be inside the loop, which matters for performance as mentioned by @anubhava – Kalib Zen Oct 13 '20 at 19:54
  • 2
    I deleted my incorrect `IFS=`. Please remember that using `awk` inside a loop is wrong most of the time. – Walter A Oct 13 '20 at 20:00
1

Use this Perl one-liner:

perl -lane 'print "LOG: $F[0] and $F[1]";' output.txt > new.txt

The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.

SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches

Timur Shtatland