0

Hi I have a CSV data in following format

ColumnHeader1,ColumnHeader2,ColumnHeader3
valcol1p1,name=testapp1 environment=dev coldata=My_Test_Logs @$ 192.168.1.1 @$ r1 @$ r2 @$ POST API ,valcol3p1
valcol1p1,name=testapp2 environment=qa coldata=My_Test_Logs @$ 192.168.1.1 @$ r1 @$ r2 @$ GET API ,valcol3p1 

img sample data I

I need to extract the data in ColumnHeader2 column after My_Test_Logs and parse the data after the delimtter '@$'.So for each csv line I would get 4 values. I need to concatenate them with the same delimiter value '@$' and place in CSV.

The output will be something like this

Output

Now i have solved it in parts.

Like to get the ColumnHeader2 column data

awk -F "\"*,\"*" '{print $2}' Mytest.csv

or to take only first x fields using multiple chars delimiter:

awk -F"[@][$]" '{print $1,$2,$3,$4}' Mytest1.csv
where MyTest1 contains the output of extracted Columnheader2 data

But together the whole logic of extracting and then concatenating is giving some issues .Can someone please help here.I need a single script to work on my CSV and write the results in another csv rather using multiple csv or text outputs in between?

pythonNinja
  • 453
  • 5
  • 13
  • 1
    What should the output be for that sample data? – Shawn Mar 11 '20 at 11:48
  • Something like this..Showed just one row but actually it would be all the rows from the csv 192.168.1.1 @$ r1 @$ r2 @$ GET API – pythonNinja Mar 11 '20 at 11:50
  • Please, edit the sample output to the question for readability. – James Brown Mar 11 '20 at 11:55
  • @JamesBrown done as suggested – pythonNinja Mar 11 '20 at 12:01
  • 1
    You are on the right track, you just need to set the OFS variable also `awk -F ',| *@[$] *' -v OFS='@$' '{print $3,$4,$5}' sample.txt` (this combines both awk commands) – Sorin Mar 11 '20 at 12:32
  • Perfect. @Sorin ...Any idea how I can get output with new columns as I have shown in my output. Now I get the output like this with your command @$@$mnHeader3 192.168.1.1@$r1@$r2 192.168.1.1@$r1@$r2 What I want is something this outCol1@$outCol2@$outcol3 192.168.1.1@$r1@$r2 192.168.1.1@$r1@$r2 – pythonNinja Mar 11 '20 at 12:44
  • 1
    @pythonNinja well I wasn't planning to write an answer. However, there you go https://stackoverflow.com/a/60636646/939457 – Sorin Mar 11 '20 at 13:03

3 Answers3

1

This should met both your requirements:

awk -F',| *@[$] *' -v OFS='@$' -e 'NR==1 {print "outCol1","outCol2","outcol3"}; NR > 1 { print $3,$4,$5}' sample.txt

outCol1@$outCol2@$outcol3
192.168.1.1@$r1@$r2
192.168.1.1@$r1@$r2
  • -F',|@[$]' - field separator is a regular expression in awk, this will match both,and@$, and also trim the spaces around@$`
  • -v OFS='@$' - the default field separator for output is space, this will set it to '@$'
  • NR == 1 {print "outCol1","outCol2","outcol3"} - for first line print the new header
  • NR > 1 { print $3,$4,$5} - for the rest of the lines print fields $3,$4,$5

Note: this assumes that there are no , or @$ escaped in the rest of the CSV, otherwise you should use a proper CSV parser.

Sorin
  • 5,201
  • 2
  • 18
  • 45
0

Using awk to split at My_Test_Logs:

$ awk -F, '{print (split($2,a,/My_Test_Logs /)>1?a[2]:a[1])}' file

Output:

ColumnHeader2
@$ 192.168.1.1 @$ r1 @$ r2 @$ POST API 
@$ 192.168.1.1 @$ r1 @$ r2 @$ GET API 

Update:

In the light of better presented expected output:

$ awk -F, '{print (split($2,a,/ /)>1?a[5] OFS a[6] OFS a[7] OFS a[8] OFS a[9] OFS a[10] OFS a[11] OFS a[12]:a[1])}' file

Output now:

ColumnHeader2
192.168.1.1 @$ r1 @$ r2 @$ POST API
192.168.1.1 @$ r1 @$ r2 @$ GET API
James Brown
  • 36,089
  • 7
  • 43
  • 59
0

This will handle the CSV in a safe way unlike most bash tools.

#!/usr/local/bin/python3

import csv
csvr = csv.reader(open('your_file.csv'), delimiter = ",")

results = []
index=0
for row in csvr:
    if index == 0:
        index = index +1
        continue
    results.append(row[1].split('@$')[1:])

for row in results:
    print(",".join(row))

Output :

 192.168.1.1 , r1 , r2 , POST API 
 192.168.1.1 , r1 , r2 , GET API

Hope it helps!

Matias Barrios
  • 4,674
  • 3
  • 22
  • 49