
example file

aaa [bbb bb] ccc "ddd dd" eee
bbb [ccc cc] ddd "eee ee" fff

expected:

line1
s1="aaa", s2="bbb bb", s3="ccc", s4="ddd dd", s5="eee"
line2
s1="bbb", s2="ccc cc", s3="ddd", s4="eee ee", s5="fff"

Thanks in advance!

Alex Tang
  • What did you try for yourself? and what are the variable `s1` to `s5`. You want them stored in multiple variables? Why not an array – Inian Nov 30 '18 at 20:03
  • how about parenthesis, curly braces, single quotes? – karakfa Nov 30 '18 at 20:06
  • I tried AWK: `awk -F" " '{print $1, $2, $3, $4, $5}'`, but I'm not sure what delimiter to pass to `-F` – Alex Tang Nov 30 '18 at 20:07
  • It would help to use a standard file format, and a language that already has a parser for that format. This type of data processing really isn't what the shell is intended for. – chepner Nov 30 '18 at 20:09
  • agreed. I am a java guy however I have to use bash to parse this type of log file. Thank you – Alex Tang Nov 30 '18 at 20:12
  • You posted input and output, or do you want to have bash variables set that way? I find this question as unclear. Also, bash uses no `,` as a separator nowhere, you want to have a `,` character suffixed to all variables? – KamilCuk Nov 30 '18 at 20:31
  • This is not "space-delimited". – Paul Hodges Nov 30 '18 at 22:04

2 Answers


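Using GNU awk, you can set FPAT to describe what a field looks like (a [...] group, a "..." group, or a run of non-space characters) instead of describing the separator: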

awk -v OFS=", " -v FPAT='\\[[^]]*\\]|"[^"]*"|[^[:space:]]+' '{
   for (i=1; i<=NF; i++) {
      gsub(/^[["]|[]"]$/, "", $i)
      $i = "s" i "=\"" $i "\""
   }
   $0 = "line" NR ORS $0
} 1' file

Output:

line1
s1="aaa", s2="bbb bb", s3="ccc", s4="ddd dd", s5="eee"
line2
s1="bbb", s2="ccc cc", s3="ddd", s4="eee ee", s5="fff"
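
FPAT is a GNU awk extension, so the command above will not work with other awk implementations. As a rough sketch only, the same three-way field pattern can be approximated in bash with the `[[ =~ ]]` regex operator. The helper name `tokenize` and the global array `tokens` are made up for this illustration, and the sample input is embedded in a here-document so the snippet runs on its own:

```shell
#!/usr/bin/env bash
# Hypothetical helper: split one line into the global array `tokens`,
# treating [..] and ".." groups as single fields with the delimiters removed.
tokenize() {
  local rest=$1 tok
  # Leading token is a [group], a "group", or a bare word; the tail is captured last.
  local re='^[[:space:]]*(\[[^]]*\]|"[^"]*"|[^[:space:]]+)(.*)$'
  tokens=()
  while [[ $rest =~ $re ]]; do
    tok=${BASH_REMATCH[1]}
    rest=${BASH_REMATCH[2]}
    case $tok in
      \[*\]|\"*\") tok=${tok:1:${#tok}-2} ;;  # drop the enclosing pair
    esac
    tokens+=( "$tok" )
  done
}

n=0
while IFS= read -r row; do
  tokenize "$row"
  printf 'line%d\n' "$((++n))"
  out=""
  for i in "${!tokens[@]}"; do
    out+="${out:+, }s$((i + 1))=\"${tokens[i]}\""
  done
  printf '%s\n' "$out"
done <<'EOF'
aaa [bbb bb] ccc "ddd dd" eee
bbb [ccc cc] ddd "eee ee" fff
EOF
```

This is slower than awk on large logs, but it keeps each field in a bash array, which may be handy if the values are needed later in the same script.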
anubhava

A bash-only approach:

$: IFS=']"[' read -a line < infile # read the "groups"
$: line=( "${line[@]% }" )         # strip trailing spaces
$: line=( "${line[@]# }" )         # strip leading spaces

The `line` array now holds the scrubbed fields of the first line of infile (`read` consumes one line per call).

Shown in steps:

$: IFS=']"[' read -a line < infile
$: printf "[%s]\n" "${line[@]}"
[aaa ]
[bbb bb]
[ ccc ]
[ddd dd]
[ eee]
$: line=( "${line[@]% }" )
$: printf "[%s]\n" "${line[@]}"
[aaa]
[bbb bb]
[ ccc]
[ddd dd]
[ eee]
$: line=( "${line[@]# }" )
$: printf "[%s]\n" "${line[@]}"
[aaa]
[bbb bb]
[ccc]
[ddd dd]
[eee]
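
The steps above handle a single line. A minimal sketch of applying them to every line and printing the format requested in the question might look like the following. The function name `format_line` is invented for this example, and the sample data is fed through a here-document instead of `infile`. It inherits the limits of the IFS approach: bare words separated only by spaces stay together in one field, and a line that starts with `[` or `"` yields an empty first field.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one input line as s1="...", s2="...", ...
format_line() {
  local -a fields
  local out="" i
  # Split on ], " and [ ; -r keeps any backslashes literal.
  IFS=']"[' read -r -a fields <<< "$1"
  fields=( "${fields[@]% }" )   # strip one trailing space per field
  fields=( "${fields[@]# }" )   # strip one leading space per field
  for i in "${!fields[@]}"; do
    out+="${out:+, }s$((i + 1))=\"${fields[i]}\""
  done
  printf '%s\n' "$out"
}

n=0
while IFS= read -r row; do
  printf 'line%d\n' "$((++n))"
  format_line "$row"
done <<'EOF'
aaa [bbb bb] ccc "ddd dd" eee
bbb [ccc cc] ddd "eee ee" fff
EOF
```

To process a real log, replace the here-document with `< infile` on the `done` line.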
Paul Hodges