1

I have a file built from grep output that looks like this:

http://google.fr
Pierre google
http://test.fr
--
http://yahoo.com
Jean Yahoo
http://test.fr
--

There is a '--' separator after every 3 lines. I would like to assign each line of a block to a variable, for example:

url= "http://google.fr"
name= "Pierre Google"
web= "http://test.fr"

I wrote a bash script with IFS=-- and also tried the -d option for echo, but I don't know how to assign these 3 lines to variables for each block.

Thanks for your help

Daniel R
  • `IFS` is an unordered collection of characters; `--` is exactly the same as `-`. Similarly, `-d`'s subsequent argument is a *single* character. To explain **exactly** why the `read` you used didn't work, we'd need to see it, but it's also hard to see how a single `read` would be the right job here. (I suppose one could use `IFS=$'\n' read -r -d - -a pieces` or such, and just accept that every other result would be empty). – Charles Duffy Mar 20 '17 at 14:42
  • BTW, `url= "http://google.fr"` actually runs `http://google.fr` as a command with `url` exported to the environment with an empty value. An assignment as such would be `url="http://google.fr"`, without the space. – Charles Duffy Mar 20 '17 at 14:51
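
To illustrate the two comments above, here is a minimal sketch. The helper name `count_fields` and the sample string are made up for this illustration. It shows that `IFS=--` behaves like `IFS=-` (each single `-` is a delimiter, so a doubled dash yields an empty field instead of acting as one separator), and what an assignment without the stray space looks like.

```shell
# Hypothetical helper: print how many fields $1 splits into when IFS is $2.
# The body runs in a subshell, so IFS and "set -f" do not leak to the caller.
count_fields() (
  IFS=$2
  set -f        # disable globbing so characters like * stay literal
  set -- $1     # unquoted on purpose: this triggers field splitting on IFS
  echo "$#"
)

double=$(count_fields 'a-b--c' '--')
single=$(count_fields 'a-b--c' '-')
echo "IFS=-- gives $double fields; IFS=- gives $single fields"

# No space after '=': this is a plain assignment, not a command invocation.
url="http://google.fr"
echo "$url"
```

Both counts come out the same, which is why a two-character separator cannot be expressed through IFS alone.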

3 Answers

4

With a bit of error-handling, this might look like:

while read -r url && read -r name && read -r web; do
  echo "Read url of $url, name of $name, and web of $web"
  read -r sep || { true; break; } # nothing to read: exit loop w/ successful $?
  if [[ $sep != -- ]]; then
    printf 'Expected separator, but saw: %q\n' "$sep" >&2
    false; break # "--" not seen where expected; exit loop w/ $? indicating failure
  fi
done <in.txt

See BashFAQ #1.

(By the way -- if you don't want leading and trailing whitespace stripped, I would suggest clearing IFS with IFS= -- either scoped to the reads as in while IFS= read -r url && IFS= read -r name && IFS= read -r web, or global to the script if there's nothing else going on where the side effects would be undesired).
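
As a rough sketch of the scoped `IFS=` variant described above, the loop can be wrapped in a function so that the caller gets an explicit exit status through `return`. The function name `read_blocks` and the sample data (including the leading spaces in the second name) are assumptions for illustration; the input layout is the one from the question.

```shell
# Hypothetical wrapper around the three-read loop; reads blocks from stdin.
read_blocks() {
  while IFS= read -r url && IFS= read -r name && IFS= read -r web; do
    printf 'url=[%s] name=[%s] web=[%s]\n' "$url" "$name" "$web"
    IFS= read -r sep || return 0    # input ended right after a block: success
    if [ "$sep" != "--" ]; then
      printf 'Expected separator, but saw: %s\n' "$sep" >&2
      return 1                      # "--" not seen where expected: failure
    fi
  done
  return 0
}

# Sample input: the second name has leading spaces, which IFS= preserves.
out=$(read_blocks <<'EOF'
http://google.fr
Pierre google
http://test.fr
--
http://yahoo.com
  Jean Yahoo
http://test.fr
--
EOF
)
status=$?
printf '%s\n' "$out"
echo "exit status: $status"
```

Using `return` inside a function makes the success or failure of the parse unambiguous to the caller, whichever way the loop ends.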

Charles Duffy
1

You can preprocess the file into a more typical format to use IFS to separate into fields (or variables) with a utility such as awk or sed:

while IFS="|" read -r url name web; do
    echo "$url" "$name" "$web"
done  < <(awk 'BEGIN{RS="--\n"; FS="\n"; OFS="|"} {print $1,$2,$3}' file)

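That preserves leading and trailing white space on each line, provided no field contains a "|". Note that a multi-character RS is not specified by POSIX awk; GNU awk supports it.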

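If you want to strip leading and trailing white space, do the trimming in awk and keep the "|" separator. (Simply removing IFS="|" and OFS="|" would make read split on every space, so a name such as "Pierre google" would be spread across name and web.)

while IFS="|" read -r url name web; do
    echo "$url" "$name" "$web"
done  < <(awk 'BEGIN{RS="--\n"; FS="\n"; OFS="|"} {for(i=1;i<=3;i++) gsub(/^[ \t]+|[ \t]+$/, "", $i); print $1,$2,$3}' file)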
dawg
0

This should work for your case:

    NR=0
    while read LINE; do
        ((NR++))
        case $NR in
            1)
                URL=$LINE
                ;;
            2)
                NAME=$LINE
                ;;
            3)
                WEB=$LINE
                ;;
            4)
                if [ $LINE = "--" ]; then
                    NR=0
                    #
                    # DO WHAT EVER YOU WANT TO DO WITH YOUR DATA HERE
                    #
                    #
                    echo "$URL;;$NAME;;$WEB"
                else
                    echo "wrong delimiter found \"$LINE\""
                fi
                ;;
        esac
    done

Run it with:

script.sh < inputfile.txt

or just pipe the output of your grep command to the script.

Mario Keller
  • 8 minutes too slow :-) – Mario Keller Mar 20 '17 at 14:52
  • A few notes. First -- all-caps variable names are reserved for variables with meaning to the operating system or shell; application-defined variables should have at least one lower-case character in their names. See relevant POSIX guidelines at http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html, keeping in mind that shell variables and environment variables share a namespace. – Charles Duffy Mar 20 '17 at 14:52
  • Second -- you're quoting exactly the wrong thing in `[ $LINE = "--" ]`; the `--` string can only ever evaluate to itself if unquoted, but `$LINE` unquoted can become zero or more words. For instance, if you had a line containing `*`, that would be replaced with a list of filenames, so you'd have something like `[ a.txt b.txt = -- ]` run, which isn't valid `test` syntax and would produce an error on stderr. Similarly, `[ = -- ]` (which would result from an empty line) would produce an error because `=` isn't valid as a unary operator. `[ "$LINE" = -- ]` would avoid this concern. – Charles Duffy Mar 20 '17 at 14:53
  • (Also, I'd suggest using the `-r` argument to `read` unless you have a specific reason not to: Without `-r`, backslashes in the input will be interpreted as line continuations, so if you had a name ending in a backslash, then you'd have the web address on the following line returned by the same `read` as the name, thus throwing off the counting altogether). – Charles Duffy Mar 20 '17 at 14:56
  • Oh -- and use `((++NR))` instead of `((NR++))` if you don't want [surprises](http://mywiki.wooledge.org/BashFAQ/105) when your code is run with `set -e`. – Charles Duffy Mar 20 '17 at 15:00
  • Thanks for the input. My solution was more or less a quick hack. I should have put more effort into the script. – Mario Keller Mar 20 '17 at 15:02
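
Putting the suggestions from these comments together, a revised version of the counter-based loop might look like the sketch below. The function name `parse_blocks` and the sample input are assumptions added for illustration. The changes are: lower-case variable names, `read -r`, a quoted `"$line"` in the test, and `nr=$((nr + 1))` as a portable stand-in for `((++nr))`. Unlike the original, this sketch stops with a non-zero status on a bad delimiter instead of continuing.

```shell
# Hypothetical function; reads 4-line blocks (url, name, web, "--") from stdin.
parse_blocks() {
  nr=0
  while read -r line; do        # -r: backslashes are kept literally
    nr=$((nr + 1))              # plain assignment, so no surprise under set -e
    case $nr in
      1) url=$line ;;
      2) name=$line ;;
      3) web=$line ;;
      4)
        if [ "$line" = "--" ]; then   # quoted: safe for empty lines and "*"
          nr=0
          printf '%s;;%s;;%s\n' "$url" "$name" "$web"
        else
          printf 'wrong delimiter found "%s"\n' "$line" >&2
          return 1
        fi
        ;;
    esac
  done
}

# Sample input: the second name ends in a backslash, which -r keeps intact.
out=$(parse_blocks <<'EOF'
http://google.fr
Pierre google
http://test.fr
--
http://yahoo.com
Jean Yahoo\
http://test.fr
--
EOF
)
printf '%s\n' "$out"
```

Without `-r`, the trailing backslash in the second name would join that line with the following one and throw the counter off, which is the failure mode described in the comment above.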