I have been trying to write a script that parses a CSV file and prints the output in a specified format.

The input file is in the following format:

collectionBeginTime,ID,MU,hostname,Granularity,SampleInterval,suspectFlag,memCpuUsage,memUsedMemory,memMemoryCapacity,memRequestNum,memOnlineUserNum,memUsedLogDisk,memLogDiskCapacity,freeCPUUsage,freeMemory,freeLogDisk
2015-11-27 17:30:00-0500,NE=2106384,hwMEMPerformanceCollect,PG_172.16.169.70,900,900,0,24,7130,36153,0,1554,23026,157239,76,29023,134213
2015-11-27 17:30:00-0500,NE=2106386,hwMEMPerformanceCollect,PG_172.16.169.68,900,900,0,4,7481,36153,0,1594,22778,157239,96,28672,134461

Output is expected to be in the format (showing only a few of the output lines for the first line of the input):

collectionBeginTime   ,     hostname     ,     Parameters
2015-11-27 17:30:00-0500, PG_172.16.169.70, SampleInterval:900
2015-11-27 17:30:00-0500, PG_172.16.169.70, suspectFlag:0 

I need to print columns 1 and 4 for each line after the first, followed by the column name (from line 1 of the file), : and the column value for columns 6..NF (ignoring columns 2, 3, 5 altogether). A single input line generates many output lines.

The script I have written:

#!/bin/bash

FILENAME=$1

awk -F',' 'BEGIN{OFS=",";}  { if ( NR!=1 )print $1,$4,$6,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17}' < $FILENAME >> tmp.txt

echo "completed"

The script runs, but it prints all the parameters on a single line, without their names. How do I fix it?

Jonathan Leffler
Anirban Roy
  • You capture the fields in line 1 for reuse (`for (i = 4; i <= NF; i++) name[i] = $i;`). In the other lines, you iterate over fields 4..NF printing relevant data, probably with `printf`. – Jonathan Leffler Jun 21 '16 at 18:44
  • Where do these Parameters come from? They're absent from your input sample – Aaron Jun 21 '16 at 18:48
  • In the output, columns 2, 3, 5 and 7 must be omitted, and each column from 6 onwards must be concatenated with its parameter name, like (SampleInterval:900). Can you please write the awk statement in full so that it is clearer? – Anirban Roy Jun 21 '16 at 18:51
  • @Aaron: They're there: columns 6..NF contain the values, and the entries in line 1 (fields 6..NF) contain the parameter names. Not obvious, I'll grant you, but the information is there. – Jonathan Leffler Jun 21 '16 at 18:52
  • nevermind then, I had a pretty one-liner with `cut` + `column` but it won't cut it for that use-case. – Aaron Jun 21 '16 at 18:53
  • @AnirbanRoy: you need to explain how column 7 gets to be omitted when you show it in your output. – Jonathan Leffler Jun 21 '16 at 18:53
  • sorry that is a mistake ... column 7 must not be omitted. I am really sorry for the misinformation – Anirban Roy Jun 21 '16 at 18:58

1 Answer

You capture the fields in line 1 for reuse. In the other lines, you iterate over fields 6..NF printing relevant data:

awk -F',' 'NR == 1 { for (i = 6; i <= NF; i++) name[i] = $i   # save parameter names from the header
                     printf "%s, %s, %s\n", $1, $4, "Parameters"; next }
           { for (i = 6; i <= NF; i++)                         # one output line per parameter
                 printf "%s, %s, %s:%s\n", $1, $4, name[i], $i }' "$FILENAME"

Untested code.
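
One way to try the approach is to run it against the header and first data row from the question. The following is only a sketch: the function name `reshape_csv` and the temporary sample file are assumptions made for illustration and are not part of the original script.

```shell
#!/bin/bash
# reshape_csv is a hypothetical helper name, not from the original post.
# Usage: reshape_csv FILE
reshape_csv() {
    awk -F',' '
        # Header row: remember each parameter name by column index.
        NR == 1 {
            for (i = 6; i <= NF; i++) name[i] = $i
            printf "%s, %s, %s\n", $1, $4, "Parameters"
            next
        }
        # Data rows: one output line per parameter column.
        {
            for (i = 6; i <= NF; i++)
                printf "%s, %s, %s:%s\n", $1, $4, name[i], $i
        }
    ' "$1"
}

# Sample input: the header and the first data row from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
collectionBeginTime,ID,MU,hostname,Granularity,SampleInterval,suspectFlag,memCpuUsage,memUsedMemory,memMemoryCapacity,memRequestNum,memOnlineUserNum,memUsedLogDisk,memLogDiskCapacity,freeCPUUsage,freeMemory,freeLogDisk
2015-11-27 17:30:00-0500,NE=2106384,hwMEMPerformanceCollect,PG_172.16.169.70,900,900,0,24,7130,36153,0,1554,23026,157239,76,29023,134213
EOF

# Show the header line and the first two parameter lines.
reshape_csv "$sample" | head -n 3
rm -f "$sample"
```

With this sample, the first three lines should be the `collectionBeginTime, hostname, Parameters` header followed by the `SampleInterval:900` and `suspectFlag:0` lines for `PG_172.16.169.70`. Each data row produces 12 output lines in total, one for each of columns 6 to 17.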

Jonathan Leffler