Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc.). Rows are typically denoted by newline characters. Use this tag for any delimited file format, including tab-delimited (TSV).

CSV is a file format involving a plain text file with information separated by delimiters with the purpose of storing data in a table-structured format. CSV (comma separated values) files traditionally and most commonly use a comma delimiter (hence the name), but other characters can be used, such as semi-colons, tabs, pipe symbols (|), etc.
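As a minimal sketch of reading data that uses a delimiter other than the comma, Python's standard `csv` module accepts a `delimiter` argument. The sample data and the helper name `parse_delimited` are made up for illustration only.

```python
import csv
import io


def parse_delimited(text, delimiter):
    """Parse delimited text into a list of rows (hypothetical helper)."""
    # newline="" leaves line-ending handling to the csv module.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# The same illustrative two-row table in three delimiter styles.
semicolon_rows = parse_delimited("name;city\nAda;London\n", ";")
tab_rows = parse_delimited("name\tcity\nAda\tLondon\n", "\t")
pipe_rows = parse_delimited("name|city\nAda|London\n", "|")

print(semicolon_rows)
```

All three calls should yield the same rows; only the `delimiter` argument changes.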

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes, which is required if the value itself contains the delimiter. The first row of data often contains the headers of the table's columns, which describe the meaning of the data in each column.

Example

Tabular format

Time   Temperature  Humidity  Description
08:00  70           35        Sunny and Clear
11:45  94           90        Hazy, Hot, and Humid
14:30  18                     Freezing
16:00  -200                   "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""

In this example, the first row of CSV data serves as the "header", which describes the corresponding data below it. There is no inherent way to describe within a CSV file whether the first row is a header row or not. Each successive line of the CSV file should contain the same number of fields, in the same order, as the first line.

Note:

  • Empty fields (fields with no available data, such as the Humidity field in the last two data rows) are still delimited by commas so that the fields that follow land in the correct columns.
  • Since the comma is the delimiter for fields, the commas in the Description field of the second data row must be quoted (to prevent them from being interpreted as field delimiters). Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double quote is the quote character, double quotes in the data, as in "Unliveable" in the fourth data row, must also be protected. Wrapping the field in double quotes and doubling each double quote inside it is the default method for protecting the quote character inside a field.
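The rules above can be sketched with Python's standard `csv` module. This is only an illustration, not the one way to do it: treating the first row as a header is a choice made by the caller (here by using `csv.DictReader`), and the variable names are made up for this example.

```python
import csv
import io

# The example table as CSV text. The last row wraps the Description field
# in double quotes and doubles the quotes that belong to the data.
CSV_TEXT = (
    "Time,Temperature,Humidity,Description\n"
    "08:00,70,35,Sunny and Clear\n"
    '11:45,94,90,"Hazy, Hot, and Humid"\n'
    "14:30,18,,Freezing\n"
    '16:00,-200,,"""Unliveable"""\n'
)

# DictReader uses the first row as field names; the file itself cannot
# say whether that row is a header, so this is the caller's decision.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT, newline="")))

print(rows[1]["Description"])  # commas inside the quoted field are kept
print(rows[2]["Humidity"])     # the empty field is read as an empty string
print(rows[3]["Description"])  # doubled quotes collapse to single quotes

# Writing the rows back with the default minimal quoting re-applies the
# same protection: only fields that need quoting are quoted.
out = io.StringIO(newline="")
writer = csv.DictWriter(out, fieldnames=list(rows[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
round_tripped = out.getvalue()
```

Under these assumptions the text written back matches the input, which is a quick way to check that a file is quoted consistently.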

Questions tagged [csv] are expected to relate to programming in some way, for example, parsing/importing CSV files or creating them programmatically.


89606 questions
367 votes, 12 answers

Convert xlsx to csv in Linux with command line

I'm looking for a way to convert xlsx files to csv files on Linux. I do not want to use PHP/Perl or anything like that since I'm looking at processing several millions of lines, so I need something quick. I found a program on the Ubuntu repos called…
user1390150
363 votes, 21 answers

How do I import CSV file into a MySQL table?

I have an unnormalized events-diary CSV from a client that I'm trying to load into a MySQL table so that I can refactor into a sane format. I created a table called 'CSVImport' that has one field for every column of the CSV file. The CSV contains 99…
Iain Samuel McLean Elder
352 votes, 6 answers

CSV in Python adding an extra carriage return, on Windows

import csv with open('test.csv', 'w') as outfile: writer = csv.writer(outfile, delimiter=',', quoting=csv.QUOTE_MINIMAL) writer.writerow(['hi', 'dude']) writer.writerow(['hi2', 'dude2']) The above code generates a file, test.csv, with…
apalopohapa
349 votes, 9 answers

Split a comma-delimited string into an array?

I need to split my string input into an array at the commas. Is there a way to explode a comma-separated string into a flat, indexed array? Input: 9,admin@example.com,8 Output: ['9', 'admin@example.com', '8']
Kevin
348 votes, 8 answers

_csv.Error: field larger than field limit (131072)

I have a script reading in a csv file with very huge fields: # example from http://docs.python.org/3.3/library/csv.html?highlight=csv%20dictreader#examples import csv with open('some.csv', newline='') as f: reader = csv.reader(f) for row in…
user1251007
340 votes, 19 answers

Parsing CSV files in C#, with header

Is there a default/official/recommended way to parse CSV files in C#? I don't want to roll my own parser. Also, I've seen instances of people using ODBC/OLE DB to read CSV via the Text driver, and a lot of people discourage this due to its…
David Pfeffer
328 votes, 39 answers

How can I read and parse CSV files in C++?

I need to load and use CSV file data in C++. At this point it can really just be a comma-delimited parser (ie don't worry about escaping new lines and commas). The main need is a line-by-line parser that will return a vector for the next line each…
User1
327 votes, 18 answers

Turning a Comma Separated string into individual rows

I have a SQL Table like this: SomeID OtherID Data abcdef-..... cdef123-... 18,20,22 abcdef-..... 4554a24-... 17,19 987654-..... 12324a2-... 13,19,20 Is there a query where I can perform a query like SELECT OtherID, SplitData WHERE…
Michael Stum
312 votes, 27 answers

How can I convert JSON to CSV?

I have a JSON file I want to convert to a CSV file. How can I do this with Python? I tried: import json import csv f = open('data.json') data = json.load(f) f.close() f = open('data.csv') csv_file = csv.writer(f) for item in data: …
little_fish
311 votes, 16 answers

How do I read a large csv file with pandas?

I am trying to read a large csv file (approx. 6 GB) in pandas and I am getting a memory error: MemoryError Traceback (most recent call last) in () ----> 1…
Rajkumar Kumawat
300 votes, 12 answers

How to get rid of "Unnamed: 0" column in a pandas DataFrame read in from CSV file?

I have a situation wherein sometimes when I read a csv from df I get an unwanted index-like column named unnamed:0. file.csv ,A,B,C 0,1,2,3 1,4,5,6 2,7,8,9 The CSV is read with this: pd.read_csv('file.csv') Unnamed: 0 A B C 0 0 …
Collective Action
298 votes, 5 answers

How to skip the headers when processing a csv file using Python?

I am using below referred code to edit a csv using Python. Functions called in the code form upper part of the code. Problem: I want the below referred code to start editing the csv from 2nd row, I want it to exclude 1st row which contains headers.…
user1915050
297 votes, 8 answers

How to append a new row to an old CSV file in Python?

I am trying to add a new row to my old CSV file. Basically, it gets updated each time I run the Python script. Right now I am storing the old CSV rows values in a list and then deleting the CSV file and creating it again with the new list value. I…
laspal
294 votes, 6 answers

How to properly escape a double quote in CSV?

I have a line like this in my CSV: "Samsung U600 24"","10000003409","1","10000003427" Quote next to 24 is used to express inches, while the quote just next to that quote closes the field. I'm reading the line with fgetcsv but the parser makes a…
srgb
279 votes, 7 answers

How to add header row to a pandas DataFrame

I am reading a csv file into pandas. This csv file consists of four columns and some rows, but does not have a header row, which I want to add. I have been trying the following: Cov = pd.read_csv("path/to/file.txt", sep='\t') Frame =…
sequence_hard