I have a multiline string and I want to split it on \n. I'm setting IFS=$'\n' for this, but it's not working. I've also tried IFS= and IFS=" ", with no luck. Below is the sample code:

IFS=$'\n' read -ra arr <<< "line1\nline2\nline3"
printf "%s\n" "${arr[@]}"
# it only generates array of length 1

I'm using Ubuntu with bash version being

GNU bash, version 4.4.19(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Can someone point out the obvious?

DDStackoverflow
  • The `printf` would produce the same result if the command succeeded. How exactly are you determining the size of the array? – tripleee Dec 03 '18 at 06:13
  • You are providing a quoted string, e.g. `'line1\nline2\nline3'`, word-splitting will not occur. – David C. Rankin Dec 03 '18 at 06:15
  • @tripleee by `echo ${#arr[@]}` results in `1`. – DDStackoverflow Dec 03 '18 at 06:21
  • @DavidC.Rankin , edited the question still same result, not getting the split. – DDStackoverflow Dec 03 '18 at 06:22
  • The problem is that you are not actually reading input; you are providing a string, and whether it is single-quoted or double-quoted, word-splitting will not occur. Your default `IFS` is `" \t\n"` (`space tab newline`), so there is no need to alter `IFS`. This is one of those quirks of how the shell handles a literal. Further, you are only reading once. You seem to want `readarray` or `mapfile` rather than `read -a`. You may also be confusing `-a` with `-a -d '\n'`, where you specify that the delimiter is `'\n'`. – David C. Rankin Dec 03 '18 at 06:27
  • Switching to double quotes doesn't change the fact (if anything, the opposite). You apparently want `<<<$'string'` – tripleee Dec 03 '18 at 06:28
  • Hint: Try `cat <<<"no\nnewlines\nhere"` – tripleee Dec 03 '18 at 06:30
  • There are some defects in your script, including those others have commented on: - Even if you surround the string with double quotes, "line1\nline2\nline3" is not interpreted as newline-separated lines. - `read` is a line-oriented command and it exits after a line is read. This means that you cannot use the `-a` option when reading `newline`-separated text. – tshiono Dec 03 '18 at 06:31
  • From what I can tell you actually want `readarray -t arr < <(printf "line1\\nline2\\nline3\\n"); declare -p arr` which results in `declare -a arr='([0]="line1" [1]="line2" [2]="line3")'` (you can use just `'\n'` instead of the POSIX `'\\n'` in the `printf` string) – David C. Rankin Dec 03 '18 at 06:32

2 Answers

The primary problem with your attempt is that word-splitting is not performed with a herestring. From man bash:

Here Strings
   A variant of here documents, the format is:

          <<<word

   ... Pathname expansion and word splitting are not performed.
   The result is supplied as a single string to the command on its 
   standard input.
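
As the comments on the question point out, the string in the original command also contains no real newlines: inside double quotes, \n stays a backslash followed by the letter n. A minimal sketch to check what a here string actually delivers in each case (the variable names are made up for this illustration):

```bash
# Double quotes keep \n as two literal characters: backslash and n.
literal="line1\nline2\nline3"
# ANSI-C quoting ($'...') turns \n into real newline characters.
real=$'line1\nline2\nline3'

# A here string appends one trailing newline, so wc -l counts the lines sent.
# The arithmetic expansion strips any padding wc may add to the number.
literal_lines=$(( $(wc -l <<< "$literal") ))
real_lines=$(( $(wc -l <<< "$real") ))

printf 'double-quoted: %d line(s)\n' "$literal_lines"
printf 'ANSI-C quoted: %d line(s)\n' "$real_lines"
```

The double-quoted string arrives as a single line, which is one reason read -a can only ever fill one array element from it.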

Bash does provide a heredoc (see "Here Documents" in man bash) on which word-splitting will be performed with the default IFS. However, even then you will still read the trailing newline of each line as part of the contents of the array. Not to fear: bash provides a specific way to avoid this, the -t option to readarray (a.k.a. mapfile).

A short example as close as I can get to your original attempt would be:

readarray -t arr << EOF
line1
line2
line3
EOF
declare -p arr

Which results in your lines being saved as desired, e.g. the output would be:

declare -a arr='([0]="line1" [1]="line2" [2]="line3")'

The other alternative is to use process substitution and let printf provide the splitting, e.g.

readarray -t arr < <(printf "line1\nline2\nline3\n")

The key to filling the array without the newlines included is readarray -t; the key to allowing word-splitting to occur is to avoid the herestring.
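
To make the effect of -t visible, here is a small sketch that fills one array with -t and one without, starting from a string held in a variable. The variable names (input, arr, raw) are assumptions for the example, not anything from the question.

```bash
input=$'line1\nline2\nline3'

# With -t, readarray strips the trailing newline from each element.
readarray -t arr < <(printf '%s\n' "$input")

# Without -t, every element keeps its trailing newline.
readarray raw < <(printf '%s\n' "$input")

printf 'elements: %d\n' "${#arr[@]}"
printf 'length of first element with -t:    %d\n' "${#arr[0]}"
printf 'length of first element without -t: %d\n' "${#raw[0]}"
```

The first element is five characters long with -t and six without, the extra character being the newline.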

David C. Rankin

From the bash manual, Bash Builtins section:

read

read [-ers] [-a aname] [-d delim] [-i text] [-n nchars]
    [-N nchars] [-p prompt] [-t timeout] [-u fd] [name …]

One line is read from the standard input, or from the file descriptor fd supplied as an argument to the -u option, 

and the line delimiter can be specified with -d

-d delim

The first character of delim is used to terminate the input line, rather than newline.

Note that with an empty argument, -d '', the line delimiter is the NUL character. If there is no NUL character in the input, this can be used to read the whole input.
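
A minimal sketch of that idea, assuming the input contains real newlines and no NUL character (the variable names are chosen for the example):

```bash
input=$'line1\nline2\nline3'

# -d '' makes read stop at a NUL character. The input has none, so read
# consumes everything and returns non-zero at end of input; "|| true"
# keeps that status from aborting a script that runs under set -e.
# IFS=$'\n' applies to this one read command only.
IFS=$'\n' read -r -d '' -a arr <<< "$input" || true

printf 'elements: %d\n' "${#arr[@]}"
printf '[%s]\n' "${arr[@]}"
```

Because newline counts as IFS whitespace, consecutive newlines collapse, so empty lines in the input do not become empty array elements with this approach.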

However, read is slow, even with -r.

For larger input a faster solution can be to use word splitting:

input=$'line1\nline2\nline3'
IFS=$'\n'; set -f; arr=($input); set +f; IFS=$' \t\n'

Note the set -f, which disables pathname expansion (globbing) so that the words in $input are not matched against file names.
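
The same technique can be wrapped in a function so that IFS and the globbing setting do not have to be reset by hand. This is only a sketch: split_lines is a hypothetical helper name, not a bash builtin, and the nameref it uses needs bash 4.3 or later.

```bash
# split_lines is a hypothetical helper, not part of bash.
# Usage: split_lines ARRAY_NAME STRING
split_lines() {
    local -n out_ref=$1                 # nameref to the caller's array
    local IFS=$'\n'                     # local, so the caller's IFS is untouched
    local restore_glob=1
    [[ $- == *f* ]] && restore_glob=0   # globbing was already disabled
    set -f                              # keep * and ? from matching file names
    out_ref=($2)                        # unquoted on purpose: split on IFS
    (( restore_glob )) && set +f        # re-enable globbing only if we disabled it
    return 0
}

arr=()
input=$'line1\n*.txt\nline3'
split_lines arr "$input"

printf 'elements: %d\n' "${#arr[@]}"
printf '[%s]\n' "${arr[@]}"
```

The second line of the sample input is a glob pattern on purpose: with set -f in effect it is stored as the literal text *.txt instead of being replaced by matching file names.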

Nahuel Fouilleul