5

Just wondering how I can add single quotes around each field, so I can import the file into MySQL without warnings or errors.

I have a CSV file with lots of content.

16:47:11,3,r-4-VM,250000000.,0.50822578824,131072,0,0,0,0,0

Desired output

'16:47:07','3','r-4-VM','230000000.','0.466028518635','131072','0','0','0','60','0'

I really have no idea where to start, your help is highly appreciated.

Deano

6 Answers

19

You could try this

awk -F"," -v quote="'" -v OFS="','" '$1=$1 {print quote $0 quote}' file
  1. Replace each field separator (`,`) with quote-comma-quote (`','`) -> (-F"," -v OFS="','")
  2. Add a quote to the beginning and end of each line -> (print quote $0 quote)
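A quick sanity check of the one-liner on the question's sample line (a minimal sketch; it assumes no field contains an embedded comma):

```shell
# Run the answer's one-liner on the sample line from the question.
input='16:47:11,3,r-4-VM,250000000.,0.50822578824,131072,0,0,0,0,0'
output=$(printf '%s\n' "$input" |
  awk -F"," -v quote="'" -v OFS="','" '$1=$1 {print quote $0 quote}')
echo "$output"   # every field comes back wrapped in single quotes
```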
Alper
    The `$1=$1` idiom confused me, but Google found an explanation for me. By assigning to even one field you force AWK to rebuild the whole line, thus changing the original field separator to the new one. http://backreference.org/2010/02/10/idiomatic-awk/ – steveha May 10 '13 at 21:39
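The rebuild behaviour described in that comment is easy to see in isolation; this sketch contrasts awk with and without the `$1=$1` assignment:

```shell
# Without a field assignment, $0 keeps its original separators;
# assigning to any field makes awk rebuild $0 using OFS.
before=$(echo 'a,b,c' | awk -F',' -v OFS='|' '{print}')
after=$(echo 'a,b,c' | awk -F',' -v OFS='|' '{$1=$1; print}')
echo "$before"   # a,b,c
echo "$after"    # a|b|c
```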
2

awk and sed are not going to (easily) determine whether the field separator (,) is escaped or not. The csv file format escapes , characters within fields by surrounding the whole field in double quotes (see Section 2.6 of RFC4180).

As I describe in this answer, a more robust method is to use a csv library, rather than parsing as text using regular expressions and the like.

I found Python's csv library was the best choice because it:

  1. is widely available without onerous dependencies, with the exception of Python itself;
  2. is not particularly sensitive to the version of Python you use;
  3. lends itself to being embedded within a shell script; and
  4. is quite compact (a one-liner will do!).

Based on the question's tags, I suspect these criteria will be appealing to you too.

Thus, try the following:

QUOTE_CSV_PY="import sys; import csv; csv.writer(sys.stdout, quoting=csv.QUOTE_ALL, quotechar=\"'\").writerows(csv.reader(sys.stdin))"
python -c "$QUOTE_CSV_PY" < file

To break it down:

  • QUOTE_CSV_PY is a shell variable containing the Python one-liner commands
  • The Python commands simply:
    • import the standard sys and csv modules;
    • create a csv writer that writes to standard output (stdout) with QUOTE_ALL set so all fields get quoted using quotechar, which is set to a single quote;
    • feed the csv writer a csv reader that reads from standard input (stdin).
  • The second line simply passes the one-liner to the python interpreter, and feeds the csv file (called file) into its stdin.
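To see why the csv module is worth the trouble, here is a sketch feeding it a line with an embedded, double-quoted comma (hypothetical sample data; `python3` is assumed to be on the PATH):

```shell
QUOTE_CSV_PY="import sys; import csv; csv.writer(sys.stdout, quoting=csv.QUOTE_ALL, quotechar=\"'\").writerows(csv.reader(sys.stdin))"
# csv.writer terminates rows with \r\n, hence the tr -d '\r'
result=$(printf '%s\n' 'r-4-VM,"a,b",0' | python3 -c "$QUOTE_CSV_PY" | tr -d '\r')
echo "$result"   # the embedded comma inside "a,b" survives as one field
```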
Heath Raftery
  • That solution is particularly useful as it covers the case where a field with an embedded comma has quotes around it, which few people consider – vicarage Dec 30 '22 at 09:16
1
#!/usr/bin/awk -f

BEGIN { FS=OFS=","}

{
    for (i = 1; i <= NF; ++i)
        $i = "'" $i "'"
    print
}

At the beginning, set FS (the field separator) to a comma; also set OFS, the output field separator, to a comma.

For every input line, loop over all fields. NF is the number of fields parsed out of the current line. Set each field to its own value surrounded by single quotes.

When done updating the fields, print the modified line.
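The same script can be run inline from a shell; this is a sketch using `\047` (awk's octal escape for a single quote) to sidestep shell-quoting gymnastics:

```shell
# Inline version of the script above; \047 is the octal escape for '
quoted=$(printf '%s\n' '16:47:11,3,r-4-VM,0' |
  awk 'BEGIN { FS = OFS = "," } { for (i = 1; i <= NF; ++i) $i = "\047" $i "\047"; print }')
echo "$quoted"   # '16:47:11','3','r-4-VM','0'
```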

steveha
1

try this:

awk '{gsub(/^|$/,"\047");gsub(/,/,"\047,\047")}7' file

(\047 is the octal escape for a single quote; the \x hex escape is not defined by POSIX awk and its digit handling varies between awk versions, so the octal form is more portable.)

example

kent$  echo "16:47:11,3,r-4-VM,250000000.,0.50822578824,131072,0,0,0,0,0"|awk '{gsub(/^|$/,"\047");gsub(/,/,"\047,\047")}7'
'16:47:11','3','r-4-VM','250000000.','0.50822578824','131072','0','0','0','0','0'
Kent
1

This might work for you (GNU sed):

sed -r 's/[^,]+/'\''&'\''/g' file

or:

sed -r "s/[^,]+/'&'/g" file
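Both forms work on the question's data, but note that `[^,]+` requires at least one character per field, so empty fields are left unquoted (a sketch; `-E` is the portable spelling of GNU sed's `-r`):

```shell
quoted=$(echo '16:47:11,3,0' | sed -E "s/[^,]+/'&'/g")
echo "$quoted"   # '16:47:11','3','0'
# Empty fields slip through unquoted:
gap=$(echo 'a,,b' | sed -E "s/[^,]+/'&'/g")
echo "$gap"      # 'a',,'b'
```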
potong
0
awk 'BEGIN{FS=OFS=","}{for (i=1;i<=NF;++i)  $i="~"$i"~"}{print}' $input_csv_file

This works. Here I am enclosing all the CSV columns with a ~; substitute a single quote for the ~ to get the output the question asks for.
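A sketch of the command on small sample data (swap the `~` for a single quote, e.g. via awk's `\047` escape, to match the question):

```shell
out=$(echo 'a,b,c' | awk 'BEGIN{FS=OFS=","}{for (i=1;i<=NF;++i) $i="~"$i"~"}{print}')
echo "$out"   # ~a~,~b~,~c~
```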

Sardar Usama