I have a file in the shell whose content looks like this:

BCTS1
                             ,07/09/2021                    ,
        09:06:26                      ,09:09:26                      ,
        0 horas con 3 minutos

I would like it to look like:

BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos

Several breaklines and blank spaces.

Can anyone help?

  • There seems to be a verb missing in this sentence: “_Several breaklines and blank spaces._” – Maëlan Sep 08 '21 at 20:32
  • Can newlines occur within fields? For example, could you have `0 horas` followed by a newline and then `con 3 minutos` in the input? If so, how should those be handled - newlines removed, replaced with blanks, or something else? – Ed Morton Sep 09 '21 at 13:16

2 Answers

Given:

cat file
BCTS1
                ,07/09/2021                    ,
    09:06:26                      ,09:09:26                      ,
    0 horas con 3 minutos

Your easiest substitution is with Perl:

perl -0777 -pe 's/\s*,\s*/,/g' file
BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos

Or you can use cat, tr, and sed:

cat file | tr -d '\n' | sed 's/[[:space:]]*,[[:space:]]*/,/g'
# same output

Or with any POSIX awk:

cat file | tr -d '\n' | awk '{gsub(/[[:space:]]*,[[:space:]]*/,",")} 1'

With GNU sed:

sed -Ez 's/\s*,\s*/,/g' file
dawg
  • You can go with a single `sed` too: `sed -Ez 's/\s*,\s*/,/g'` (I suspect that line-buffering might misbehave for large files, but other solutions are equally suspect, except maybe Perl, which I’m not familiar enough with). – Maëlan Sep 08 '21 at 20:44
  • The `cat file |` solutions have a UUOC, and the `tr -d '\n'` converts the input into something that's no longer a valid POSIX text file, so YMMV with what any subsequent text-processing tool does with that. – Ed Morton Sep 09 '21 at 12:57
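
One way to address both points in the comment above is to swap `tr -d '\n'` for `paste -s`, a technique not used in the answer itself. This is a minimal sketch, assuming a POSIX `paste` where the `'\0'` delimiter means "empty string"; the function name `join_fields` and the file name `sample.txt` are hypothetical, and the whitespace widths in the sample are illustrative.

```shell
# Recreate the sample input from the question (hypothetical file name).
printf '%s\n' \
  'BCTS1' \
  '                ,07/09/2021                    ,' \
  '    09:06:26                      ,09:09:26                      ,' \
  '    0 horas con 3 minutos' > sample.txt

# join_fields (hypothetical helper): reads stdin, joins all lines with an
# empty delimiter, then strips whitespace around every comma.
# paste -s keeps the final newline, so sed still receives a valid text file.
join_fields() {
  paste -s -d '\0' - | sed 's/[[:space:]]*,[[:space:]]*/,/g'
}

join_fields < sample.txt
# prints: BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos
```

Reading from standard input with `< sample.txt` avoids the extra `cat` process.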

Using any POSIX awk:

$ awk -v RS= -F'[[:space:]]*,[[:space:]]*' -v OFS=',' '{$1=$1}1' file
BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos

and if you don't have a POSIX awk (for the [:space:] character class) then:

$ awk -v RS= -F'[ \t\n]*,[ \t\n]*' -v OFS=',' '{$1=$1}1' file
BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos

The above assumes that, like in the example you posted, you don't have any blank lines in the input. If you do then you could use this with GNU awk (for multi-char RS and \s shorthand):

$ awk -v RS='^$' -v ORS= -F'\\s*,\\s*' -v OFS=',' '{$1=$1}1' file
BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos

or this with any awk:

$ awk '{r=r $0 OFS} END{$0=r; gsub(/[ \t]*,[ \t]*/,","); sub(/[ \t]+$/,""); print}' file
BCTS1,07/09/2021,09:06:26,09:09:26,0 horas con 3 minutos
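
To illustrate the blank-line caveat mentioned above: with `RS` set to the empty string, awk reads in paragraph mode, so a blank line ends the current record. The following is only a sketch with made-up input, and `count_records` is a hypothetical helper name.

```shell
# count_records (hypothetical helper): prints how many records awk sees
# on stdin when RS is empty (paragraph mode).
count_records() {
  awk -v RS= 'END { print NR }'
}

# Made-up input: two comma-separated chunks separated by one blank line.
printf 'a ,\n b\n\nc ,\n d\n' | count_records
# prints 2: the blank line splits the input into two records
```

With input like that, the `RS=` commands would print one output line per chunk instead of a single joined line.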
Ed Morton