83

I'm having trouble getting the pandas dataframe.to_csv(...) output quoting strings right.

import pandas as pd

text = 'this is "out text"'
df = pd.DataFrame(index=['1'],columns=['1','2'])
df.loc['1','1']=123
df.loc['1','2']=text
df.to_csv('foo.txt',index=False,header=False)

The output is:

123,"this is ""out text"""

But I would like:

123,this is "out text"

Does anyone know how to get this right?

Eric Leschinski
  • 146,994
  • 96
  • 417
  • 335
user3199761
  • 835
  • 1
  • 6
  • 6

5 Answers5

125

You could pass quoting=csv.QUOTE_NONE, for example:

>>> df.to_csv('foo.txt',index=False,header=False)
>>> !cat foo.txt
123,"this is ""out text"""
>>> import csv
>>> df.to_csv('foo.txt',index=False,header=False, quoting=csv.QUOTE_NONE)
>>> !cat foo.txt
123,this is "out text"

but in my experience it's better to quote more, rather than less.

Nico
  • 12,493
  • 5
  • 42
  • 62
DSM
  • 342,061
  • 65
  • 592
  • 494
  • 12
    I tried this but got "Error: need to escape, but no escapechar set". I also tried to add escapechar='\\' or escapechar=None, but does not seem to work... – user3199761 Jan 15 '14 at 22:19
  • 7
    @user3199761: that's because your real data -- unlike the example you pasted -- has something that you need to escape, such as a string with a comma. If you output `a,"b,c",d` you can parse the results, but what are the columns if you write out `a,b,c,d`? You can't tell. – DSM Jan 15 '14 at 22:22
  • I did try your code with the same example above, but got the error. I am using pandas 0.11.0 and python 2.7.6, not sure if this helps. – user3199761 Jan 15 '14 at 22:31
  • @user3199761: it might, though I'd be a little surprised. I'm using (pre-release) 0.13.0. – DSM Jan 15 '14 at 22:42
  • @DSM, I also tried your code, and it returned the exact same error. – ericmjl Jan 15 '14 at 23:22
  • @DSM: both pandas 0.11.0 and 0.12.0 does not work, but upgrading to pandas 0.13.0 solved the problem! – user3199761 Jan 16 '14 at 00:01
  • 1
    so does this mean that in order to disable quoting, you need to load another library and call a function from it within the function for pandas? – user5359531 Jul 15 '16 at 23:05
  • 2
    @user5359531: no, it means you import the built-in `csv` library just to get the `csv.QUOTE_NONE` constant (which happens to be the number 3, but you've got no guarantee that's always going to be true). – DSM Jul 15 '16 at 23:13
  • Got error "Error: need to escape, but no escapechar set". I'm aware, that there might be issues with columns, but need the solution anyway. – Leukonoe May 11 '17 at 11:55
  • This solution works for me, even with commas into some strings. I just use a different separator : `df.to_csv(f"./export_{today}-{current_time}.tsv", sep="\t", index=False, quoting=csv.QUOTE_NONE)` – DamDam Jun 06 '22 at 08:41
34

Note: there is currently a small error in the Pandas to_string documentation. It says:

  • quoting : int, Controls whether quotes should be recognized. Values are taken from csv.QUOTE_* values. Acceptable values are 0, 1, 2, and 3 for QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONE, and QUOTE_NONNUMERIC,
    respectively.

But this reverses how csv defines the QUOTE_NONE and QUOTE_NONNUMERIC variables.

In [13]: import csv
In [14]: csv.QUOTE_NONE
Out[14]: 3
Owen
  • 3,063
  • 5
  • 30
  • 26
30

To use quoting=csv.QUOTE_NONE, you need to set the escapechar, e.g.

# Create a tab-separated file with quotes
$ echo abc$'\t'defg$'\t'$'"xyz"' > in.tsv
$ cat in.tsv
abc defg    "xyz"

# Gotcha the quotes disappears in `"..."`
$ python3
>>> import pandas as pd
>>> import csv
>>> df = pd.read("in.tsv", sep="\t")
>>> df = pd.read_csv("in.tsv", sep="\t")
>>> df
Empty DataFrame
Columns: [abc, defg, xyz]
Index: []


# When reading in pandas, to read the `"..."` quotes,
# you have to explicitly say there's no `quotechar`
>>> df = pd.read_csv("in.tsv", sep="\t", quotechar='\0')
>>> df
Empty DataFrame
Columns: [abc, defg, "xyz"]
Index: []

# To print out without the quotes.
>> df.to_csv("out.tsv", , sep="\t", quoting=csv.QUOTE_NONE, quotechar="",  escapechar="\\")
alvas
  • 115,346
  • 109
  • 446
  • 738
4

To use without escapechar:

Replace comma char , (Unicode:U+002C) in your df with an single low-9 quotation mark character (Unicode: U+201A)

After this, you can simply use:

import csv df.to_csv('foo.txt', index=False, header=False, quoting=csv.QUOTE_NONE)

penduDev
  • 4,743
  • 35
  • 37
0

If you don't want to bother with importing csv, you simply can use the following line

df.to_csv('foo.txt', index=False, header=False, quoting=3,sep=",",escapechar=",")

Simone
  • 813
  • 8
  • 21