5

I am attempting to print all data between double quotes (sampleField="sampleValue"), but am having trouble getting awk and/or sub/gsub to return every instance of data between the double quotes. I'd then like to print all instances on the respective lines they were found on, to keep the data together.

Here is a sample of the input.txt file:

deviceId="1300", deviceName="router 13", deviceLocation="Corp"
deviceId="2000", deviceName="router 20", deviceLocation="DC1"

The output I'm looking for is:

"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

I'm having trouble using gsub to remove all of the data between a , and =. Each time I've tried a different approach, it always just returns the first field and moves on to the next line.

UPDATE:

I forgot to mention that I won't know how many double quote encapsulated fields will be on each line. It could be 1, 3, or 5,000. Not sure if this affects the solution, but wanted to make sure it was out there.

Travis Crooks

6 Answers

5

A sed solution:

sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' \
    <<< 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"'

Output:

"1300", "router 13", "Corp"

Or for a file:

sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' input.txt
Rubens
2
awk -F '"' '{printf("%c%s%c, %c%s%c, %c%s%c\n", 34, $2, 34, 34, $4, 34, 34, $6, 34) } ' \
    input.txt > newfile

This is another, simpler approach that uses the double quote as the field separator. It assumes exactly three quoted values per line.

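awk 'BEGIN { t = sprintf("%c", 34); FS = t }    # t is the double quote; split on it
     { out = ""
       for (i = 2; i <= NF; i += 2)             # quoted values are the even fields
           out = out (i > 2 ? ", " : "") t $i t
       print out }' infile > outfile

This is a more general awk approach that handles any number of quoted fields per line.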
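
To see why splitting on the quote character works, it can help to dump the fields awk produces. This is only an illustrative sketch: the sample line is shortened from the question, and the variable names `line` and `fields` are made up for the example.

```shell
# Shortened sample line (two quoted values instead of three).
line='deviceId="1300", deviceName="router 13"'

# With '"' as the field separator, print each field with its index.
fields=$(printf '%s\n' "$line" |
    awk -F '"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
# 1=[deviceId=]
# 2=[1300]
# 3=[, deviceName=]
# 4=[router 13]
# 5=[]
```

The quoted values land in the even-numbered fields (2, 4, ...), which is why a loop that steps by two picks them up regardless of how many there are.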

jim mcnamara
1
awk -F \" '
    {
        sep=""
        for (i=2; i<=NF; i+=2) {
            printf "%s\"%s\"", sep, $i
            sep=", "
        }
        print ""
    }
' << END
deviceId="1300", deviceName="router 13", deviceLocation="Corp", foo="bar"
deviceId="2000", deviceName="router 20", deviceLocation="DC1"
END

outputs

"1300", "router 13", "Corp", "bar"
"2000", "router 20", "DC1"
glenn jackman
1

awk with sub/gsub is probably neither the most direct nor the easiest way to get this done. I like one-liners when they make sense:

(1) In Perl:

172-30-3-163:ajax vphuvan$ perl -pe 's/device.*?=//g' input.txt
"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

where:
-p runs the code on each input line and prints the result
-e executes the statement between the single quotes
s is the substitution command
g is a modifier that applies the substitution to every match on the line, not just the first
/device.*?=// replaces with an empty string '' any text that starts with the prefix "device" and runs through the nearest "=" sign. Note that "deviceId", "deviceName" and "deviceLocation" all start with the prefix "device", and each is immediately followed by an "=" sign

(2) In sed (from bash):

172-30-3-163:ajax vphuvan$ sed "s/deviceId=//; s/deviceName=//; s/deviceLocation=//" input.txt
"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

In this case, we are instructing sed to run three substitutions in a row, where "deviceId=", "deviceName=" and "deviceLocation=" are each replaced with an empty string ''

It is unfortunate that sed (like awk's sub and gsub) has much weaker support for regular expressions than Perl, which is the gold standard for full regular expression support. In particular, neither sed nor sub/gsub supports the non-greedy modifier "?", and this limitation considerably complicates my life.
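
A common workaround for the missing non-greedy modifier is a negated bracket expression. The sketch below keeps this answer's assumption that every key starts with "device"; the variable names `line`, `greedy` and `nearest` are only for illustration.

```shell
line='deviceId="1300", deviceName="router 13", deviceLocation="Corp"'

# Greedy: .* runs to the LAST "=", so only the final value survives.
greedy=$(printf '%s\n' "$line" | sed 's/device.*=//g')

# Negated bracket expression: [^=]* cannot cross an "=", so each match
# stops at the nearest one, much like Perl's non-greedy .*?
nearest=$(printf '%s\n' "$line" | sed 's/device[^=]*=//g')

printf '%s\n' "$greedy" "$nearest"
# "Corp"
# "1300", "router 13", "Corp"
```

Like the Perl one-liner, this would also strip a "device...=" sequence that happens to appear inside a quoted value, so it is only safe when the values cannot contain one.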

Vietnhi Phuvan
0

Try this:

awk -F\" '{ a = ""; for (i = 2; i <= NF; i += 2) a = a (i > 2 ? ",\t" : "") "\"" $i "\""; print a }' temp.txt

Output:

"1300",  "router 13",     "Corp"
"2000",  "router 20",     "DC1"
Mirage
0

This is late, but one easy solution, assuming exactly three fields per line, would be:

 $ awk -F"=|," '{print $2,$4,$6}' input.txt
"1300" "router 13" "Corp"
"2000" "router 20" "DC1"
krock1516