16

I'm trying to write a list of strings like below to a file separated by the given delimiter.

res = [u'123', u'hello world']

When I try splitting by TAB like below it gives me the correctly formatted string.

writer = csv.writer(sys.stdout, delimiter="\t")
writer.writerow(res)

gives --> 123   hello world

But when I try to split by space using delimiter=" ", it gives me the space but with quotation marks like below.

123 "hello world"

How do I remove the quotation marks, so that when I use a space as the delimiter I get 123 hello world?

EDIT: When I try using escapechar, it no longer adds the double quotes, but everywhere a space appears in my test data, the space is doubled.

samsamara
  • 6
    Did you read the doc of **The Standard Python Library** and did you try `writer = csv.writer(sys.stdout, delimiter="\t", quoting = csv.QUOTE_NONE)` ? – Serge Ballesta May 27 '14 at 06:12
  • 2
    The issue, as Peter DeGlopper pointed out, is that the delimiter appears in my test data. – samsamara May 27 '14 at 07:17

4 Answers

24

You can tell csv.writer to quote nothing with quoting=csv.QUOTE_NONE, for example:

import csv
with open('eggs.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ',
                            escapechar=' ', quoting=csv.QUOTE_NONE)
    spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
    spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])

Produces:

Spam Spam Spam Spam Spam Baked  Beans
Spam Lovely  Spam Wonderful  Spam

If you use QUOTE_NONE, you also need an escape character.
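The doubled spaces in the output above come from the escape character being the same as the delimiter. As a rough sketch (written for Python 3, with a backslash escape character and an in-memory buffer, neither of which appears in the original answer), a distinct escape character keeps the escaping visible and lets a reader with the same settings recover the original fields:

```python
import csv
import io

row = ['123', 'hello world']

# Write to an in-memory buffer instead of a file (assumption for illustration).
buf = io.StringIO()
writer = csv.writer(buf, delimiter=' ', escapechar='\\',
                    quoting=csv.QUOTE_NONE)
writer.writerow(row)
text = buf.getvalue()
print(repr(text))  # the embedded space is prefixed with a backslash

# Read the line back with the same settings to recover the fields.
reader = csv.reader(text.splitlines(), delimiter=' ', escapechar='\\',
                    quoting=csv.QUOTE_NONE)
parsed = next(reader)
print(parsed)
```

Whether a backslash in the output is acceptable depends on what consumes the file; it is only useful if the consumer understands the same escaping.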

Dair
12

Quoting behavior is controlled by the various quoting arguments provided to the writer (or set on the Dialect object if you prefer to do things that way). The default setting is QUOTE_MINIMAL, which will not produce the behavior you're describing unless a value contains your delimiter character, quote character, or line terminator character. Double-check your test data: [u'123', u'hello'] won't produce what you describe, but [u'123', u' hello'] would.

You can specify QUOTE_NONE if you're sure that's the behavior you want, in which case it'll either try to escape instances of your delimiter character if you set an escape character, or raise an exception if you don't.
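A minimal sketch of the behavior described above, assuming Python 3 and a small hypothetical helper named render that writes one row to an in-memory buffer: the default QUOTE_MINIMAL quotes only the field that contains the delimiter, and QUOTE_NONE without an escape character raises csv.Error for that same field.

```python
import csv
import io

def render(row, **kwargs):
    """Hypothetical helper: write one row with a space delimiter, return the text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=' ', **kwargs).writerow(row)
    return buf.getvalue()

# Default QUOTE_MINIMAL: no field contains the delimiter, so nothing is quoted.
plain = render(['123', 'hello'])

# The second field contains a space, so only that field is quoted.
quoted = render(['123', 'hello world'])

# QUOTE_NONE with no escapechar cannot represent the embedded space.
try:
    render(['123', 'hello world'], quoting=csv.QUOTE_NONE)
    raised = False
except csv.Error:
    raised = True

print(repr(plain), repr(quoted), raised)
```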

Peter DeGlopper
  • 1
    Informative answer. Yes, my test data contains the delimiter. So I used csv.writer(sys.stdout, delimiter=' ', escapechar=' ', quoting=csv.QUOTE_NONE). Then it doesn't have the double quotes, but each space in my test data is replaced by 2 spaces. What is the solution for this? – samsamara May 27 '14 at 07:15
  • Well, what behavior do you want for cases in which your test data contains the delimiter? The library is trying to avoid creating output that cannot be reliably read using the same settings - that is, it must be able to distinguish between the output from `['foo', 'bar', 'baz']` and `['foo', 'bar baz']`. And for that matter `['foo', 'bar', ' baz']`. If those potential ambiguities don't matter to you, you might be better off just using `' '.join` as John Mee suggested. I generally think it's best to preserve those distinctions but it entirely depends on your situation. – Peter DeGlopper May 27 '14 at 07:20
  • I tested that when the delimiter appears in the test data, csv fails but join() still does the work. So which one is preferred, given that my delimiter could appear in my test data (join or csv)? – samsamara May 27 '14 at 07:40
  • `csv` will produce unambiguous output, in which a later parser can distinguish whether the delimiter is present because it's dividing fields or because it was present in the input. `join` will not - it will produce the same output for `['foo', 'bar', 'baz']` and `['foo', 'bar baz']`. For that reason I consider `csv` generally better, but if what your next step needs is `'foo bar baz'` in both cases `join` will give it to you with a lot less work. – Peter DeGlopper May 27 '14 at 07:47
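The trade-off discussed in these comments can be illustrated with a short sketch (Python 3, using an in-memory buffer as an assumption for illustration): join collapses the first two lists into the same string, while csv with its default quoting writes output that reads back as the original lists.

```python
import csv
import io

rows = [
    ['foo', 'bar', 'baz'],
    ['foo', 'bar baz'],
    ['foo', 'bar', ' baz'],
]

# join: the first two rows become indistinguishable.
joined = [' '.join(r) for r in rows]
print(joined)

# csv with default QUOTE_MINIMAL: fields containing the delimiter are quoted.
buf = io.StringIO()
csv.writer(buf, delimiter=' ').writerows(rows)
print(buf.getvalue())

# Reading back with the same delimiter recovers every original row.
round_tripped = list(csv.reader(buf.getvalue().splitlines(), delimiter=' '))
print(round_tripped == rows)
```

If the next processing step only needs the flat string, join remains the simpler choice; the csv form matters when something later has to split the line back into fields.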
4

Do you need the csv lib? Just join the strings...

>>> res = [u'123', u'hello'] 
>>> print res
[u'123', u'hello']
>>> print " ".join(res)
123 hello
John Mee
  • This is a fair point - the csv library is wonderful for helping you avoid confusion when your delimiter might appear in the text, but if you're sure it doesn't and you don't need quoting it's overkill. – Peter DeGlopper May 27 '14 at 07:13
1

What worked for me was writing to a regular file object instead of using csv.writer, and simply placing the delimiter between columns ('\t' in my case):

with open(target_path, 'w', encoding='utf-8') as fd:
    # iterate over the rows of a pandas dataframe called mydf
    for i in mydf.index:
        # create a string out of column 0, '\t' (tab delimiter) and column 1:
        output = mydf.loc[i][0] + '\t' + mydf.loc[i][1] + '\n'
        # write that output string (line) to the file in every iteration
        fd.write(output)

It might not be the "correct" way but it definitely kept the original lines in my project, which included many strings and quotations.

bhristov
Noaha