Possible to use for..in loop in bash to iterate over lines of process output?

Question

Consider this example:

cat > test.txt <<EOF
hello world
hello bob
super world
alice worldview
EOF

# using cat to simulate another command output piping;
# get only lines that end with 'world'

fword="world"
for line in "$(cat test.txt | grep " ${fword}\$")"; do
  echo "for line: $line"
done

echo "-------"

while read line; do
  echo "while line: $line"
done <<< "$(cat test.txt | grep " ${fword}\$")"

The output of this script is:

for line: hello world
super world
-------
while line: hello world
while line: super world

So, basically, the command substitution in the for ... in loop ended up compacted into a single string (with newlines inside) - which for ... in still sees as a single "entry", and so it loops only once, dumping the entire output.

The while loop, on the other hand, uses the "classic" here-string - and even with the same quoting of the command substitution (that is, "$(cat test.txt | grep " ${fword}\$")"), the here-string ends up serving lines one-by-one to the while, so it loops as expected (twice in this example).

Could anyone explain why this difference happens - and if it is possible to "massage" the formatting of the for .. in loop, so it also loops correctly like the while loop?

( It is much easier for me to parse what is going on in the for .. in syntax, so I'd love to be able to use it to run through loops like these (built out of results of pipelines and command substitution) - that is why I'm asking this question. )

You have the entire command in quotes in your `for` loop, so it comes across as one entry. — Tim Roberts, Jan 19 '22 at 21:33
Thanks, @TimRoberts, but: 1. those exact same quotes are used as the input for the here-string of while, and in that case, it is not a problem; 2. if I remove the quotes on the `for..in`, I don't get splitting at each line, as I'd expect - but at each word (which I don't need) — sdbbs, Jan 19 '22 at 21:40
It is the "read" command that splits the file into lines. And yes, the `for` command splits into words, not into lines. The `while read` loop is the better solution for line-oriented stuff. — Tim Roberts, Jan 19 '22 at 21:44
[Why you don't read lines with "for"](http://mywiki.wooledge.org/DontReadLinesWithFor) — glenn jackman, Jan 19 '22 at 21:52

score 1 · Answer 1 · answered Jan 19 '22 at 22:16

why this difference happens

It is read (not while) that reads the input line by line: each call consumes input up to the next newline character, and while simply repeats the call until read fails at end of input.

for iterates over words, and a quoted "anything" (except "$@" and "${array[@]}") is always exactly one word. So the loop sees a single word and runs once.
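The word counts can be checked directly. The following is a minimal sketch, not part of the original answer: count_words is a hypothetical helper, the sample string stands in for the grep output, and mapfile assumes bash 4 or newer. It also shows the "${array[@]}" exception mentioned above, which is one way to keep a for ... in loop while still iterating per line.

```bash
#!/usr/bin/env bash
# Sample data standing in for the command output (the two lines grep matches in the question).
output=$'hello world\nsuper world'

# Hypothetical helper: prints how many words it received,
# i.e. how many times a `for ... in` loop over the same list would iterate.
count_words() { echo "$#"; }

quoted=$(count_words "$output")    # quotes suppress splitting: one word
unquoted=$(count_words $output)    # default IFS splits on spaces and newlines

# mapfile (bash 4+) stores one array element per input line.
mapfile -t lines <<< "$output"
as_array=$(count_words "${lines[@]}")

echo "quoted=$quoted unquoted=$unquoted array=$as_array"
# prints: quoted=1 unquoted=4 array=2

for line in "${lines[@]}"; do
  echo "for line: $line"
done
```

Note that the array, like the command substitution, holds the whole output in memory before the loop starts.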

if it is possible to "massage" the formatting of the for .. in loop, so it also loops .. like the while loop?

An unquoted expansion undergoes word splitting, where the result is split into words at the characters listed in IFS. So you can set IFS to a newline, and the text will be split into words on newlines only.

IFS=$'\n'
for i in $(grep " ${fword}\$" test.txt); do
  echo "for line: $i"
done

loops correctly

None of this is correct.

An unquoted expansion undergoes word splitting and filename expansion. Any word containing * ? [ is treated as a glob pattern and replaced by the matching filenames (or left unchanged if nothing matches).
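To make the filename expansion problem concrete, here is a small sketch under stated assumptions: the scratch directory, the file names a.txt and b.txt, and the sample output containing *.txt are all made up for illustration. It compares the IFS=$'\n' loop with and without set -f, which disables filename expansion.

```bash
#!/usr/bin/env bash
# Work in a scratch directory so the glob has known files to match.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt

# Hypothetical command output whose first line happens to be a glob pattern.
output=$'*.txt\nsuper world'

IFS=$'\n'

# Splitting on newlines works, but the first line is then expanded as a glob.
globbed=()
for i in $output; do globbed+=("$i"); done

# With filename expansion disabled, the line is kept literally.
set -f
literal=()
for i in $output; do literal+=("$i"); done
set +f

unset IFS
cd - > /dev/null && rm -rf "$dir"

printf 'globbed: %s\n' "${globbed[@]}"
printf 'literal: %s\n' "${literal[@]}"
```

The first loop iterates over a.txt, b.txt and "super world"; the second over *.txt and "super world". Having to toggle both IFS and set -f around the loop is part of why the while read form is usually preferred.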

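read without -r treats \ as an escape character and removes it from the input, and with the default IFS it strips leading and trailing whitespace (spaces and tabs) from each line. That is why the loops below use IFS= read -r.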
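The effect of -r and of IFS= on read can be seen on a single line. This is an illustrative sketch; the input string is invented for the example.

```bash
#!/usr/bin/env bash
# Hypothetical input line with surrounding spaces and backslashes.
input='  C:\temp\new  '

read plain <<< "$input"          # backslashes consumed, surrounding spaces trimmed
read -r no_escape <<< "$input"   # backslashes kept, surrounding spaces still trimmed
IFS= read -r raw <<< "$input"    # line preserved exactly

printf '[%s]\n' "$plain" "$no_escape" "$raw"
# prints:
# [C:tempnew]
# [C:\temp\new]
# [  C:\temp\new  ]
```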

It is much easier for me to parse

But it is just not correct. for i in $(...) is a common antipattern - you should not use it. Executing a command, storing its whole output and only then splitting it is expensive. While that is fine for small files, it may bite when parsing logs. Usually you want to process the output of a command while the command is still running - think in pipelines. In other words, <<<"$(stuff)" is an antipattern too; it is better to do stuff | while ....

Get used to the while syntax and to pipes:

grep " ${fword}\$" test.txt |
while IFS= read -r line; do
  echo "while line: $line"
done

Or in Bash with process substitution:

while IFS= read -r line; do
  echo "while line: $line"
done < <(grep " ${fword}\$" test.txt)

Or one step at a time, at the cost of holding the whole output in memory:

tmp=$(grep " ${fword}\$" test.txt)
while IFS= read -r line; do
  echo "while line: $line"
done <<<"$tmp"

See https://mywiki.wooledge.org/BashFAQ/001 (and if going with pipes, see https://mywiki.wooledge.org/BashFAQ/024 ). Check your script with shellcheck - it will catch most mistakes.
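The caveat behind the BashFAQ/024 link is that, by default, bash runs a while loop at the end of a pipeline in a subshell, so variables set inside it are lost afterwards. A minimal sketch of the difference, assuming default shell options (lastpipe off) and using a hypothetical matches function in place of the grep command:

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the grep command used in the answer.
matches() { printf '%s\n' 'hello world' 'super world'; }

piped_count=0
matches | while IFS= read -r line; do
  piped_count=$((piped_count + 1))     # increments a copy inside the subshell
done

procsub_count=0
while IFS= read -r line; do
  procsub_count=$((procsub_count + 1)) # runs in the current shell
done < <(matches)

echo "piped=$piped_count procsub=$procsub_count"
# prints: piped=0 procsub=2
```

If the loop only prints or writes files, the pipe form is fine; if it has to set variables that are used after the loop, the process substitution form avoids the surprise.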
