Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc.). Rows are typically denoted by newline characters. Use this tag for any delimited file format, including tab-delimited (TSV).

CSV is a plain-text file format that stores tabular data with fields separated by delimiters. CSV (comma-separated values) files traditionally and most commonly use a comma as the delimiter (hence the name), but other characters can be used, such as semicolons, tabs, or pipe symbols (|).
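As a minimal sketch of how the delimiter choice affects parsing, the snippet below reads the same small table written with four different delimiters using Python's standard csv module. The helper name parse_delimited and the sample data are hypothetical, chosen only for illustration.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    # newline="" leaves line-ending handling to the csv module.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# The same two-row table, written with four different delimiters.
comma_rows = parse_delimited("a,b,c\n1,2,3\n")
semicolon_rows = parse_delimited("a;b;c\n1;2;3\n", delimiter=";")
tab_rows = parse_delimited("a\tb\tc\n1\t2\t3\n", delimiter="\t")
pipe_rows = parse_delimited("a|b|c\n1|2|3\n", delimiter="|")

print(comma_rows)
```

All four calls yield the same rows; only the delimiter argument changes. Note that every value comes back as a string, so numeric conversion is left to the caller.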

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes, and must be if they contain the delimiter. The first row often contains the headers of the table's columns, which describe the meaning of the data in each column.

Example

Tabular format

Time   Temperature  Humidity  Description
08:00  70           35        Sunny and Clear
11:45  94           90        Hazy, Hot, and Humid
14:30  18                     Freezing
16:00  -200                   "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""

In this example, the first row of CSV data serves as the "header", which describes the corresponding data below it. A CSV file has no inherent way to indicate whether its first row is a header row, so whoever reads the file has to know or guess. Each successive line should contain the same number of fields as the first line, in the same order.
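A rough sketch of reading this table in Python follows. Because the file cannot say whether it has a header, the caller passes that in explicitly; the function also checks that every row has as many fields as the first. The name read_table and the has_header parameter are hypothetical, and the data is the example table with the quoted description written using fully doubled quotes.

```python
import csv
import io

EXAMPLE = (
    "Time,Temperature,Humidity,Description\n"
    "08:00,70,35,Sunny and Clear\n"
    '11:45,94,90,"Hazy, Hot, and Humid"\n'
    "14:30,18,,Freezing\n"
    '16:00,-200,,"""Unliveable"""\n'
)


def read_table(text, has_header=True):
    """Return (header, records); raise ValueError if rows differ in field count."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return [], []
    expected = len(rows[0])
    for row_no, row in enumerate(rows, start=1):
        if len(row) != expected:
            raise ValueError(
                f"row {row_no} has {len(row)} fields, expected {expected}"
            )
    # The caller decides whether the first row is a header; the file cannot say.
    if has_header:
        return rows[0], rows[1:]
    return [], rows


header, records = read_table(EXAMPLE)
print(header)
print(records[3][3])  # the description field with its literal double quotes
```

Python's csv.Sniffer class offers a has_header() heuristic, but it is a guess based on the data, so an explicit flag is the more predictable choice when the layout is known.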

Note:

  • Empty fields (fields with no available data, such as the third field in the last two data rows) are still delimited by commas so that the fields that follow are placed in the correct columns.
  • Since the comma is the field delimiter, the Description field of the second data row, which contains commas, must be quoted to prevent those commas from being interpreted as field delimiters. Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double quote is the quote character, double quotes in the data, as in "Unliveable" in the fourth data row, must also be protected. Doubling each double quote inside a quoted field is the default method for protecting the quote character.
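The three notes above can be seen from the writing side as well. The sketch below, which assumes Python's standard csv module with its default settings (minimal quoting, doubled quotes), writes rows containing an empty field, embedded commas, and embedded double quotes; the row data mirrors the example table and the variable names are illustrative only.

```python
import csv
import io

rows = [
    ["Time", "Temperature", "Humidity", "Description"],
    ["11:45", 94, 90, "Hazy, Hot, and Humid"],  # commas inside a field
    ["14:30", 18, None, "Freezing"],            # None is written as an empty field
    ["16:00", -200, "", '"Unliveable"'],        # literal double quotes in the data
]

buffer = io.StringIO(newline="")
# lineterminator="\n" overrides the module's default "\r\n" for readability here.
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(rows)

output = buffer.getvalue()
print(output)
```

The writer quotes only the fields that need it: the field with commas is wrapped in double quotes, and the field with literal quotes is wrapped and has each inner quote doubled.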

Questions tagged [csv] are expected to relate to programming in some way, for example, parsing or importing CSV files, or creating them programmatically.

89,606 questions

163 votes, 18 answers

Parsing a comma-delimited std::string

If I have a std::string containing a comma-separated list of numbers, what's the simplest way to parse out the numbers and put them in an integer array? I don't want to generalise this out into parsing anything else. Just a simple string of comma…
Piku • 3,526 • 6 • 35 • 38

163 votes, 21 answers

How to obtain the total numbers of rows from a CSV file in Python?

I'm using Python (Django framework) to read a CSV file. I pull just 2 lines out of this CSV, as you can see. What I have been trying to do is also store the total number of rows in the CSV in a variable. How can I get the total number of rows? file =…
GrantU • 6,325 • 16 • 59 • 89

161 votes, 25 answers

How to convert JSON to CSV format and store in a variable

I have a link that opens up JSON data in the browser, but unfortunately I have no clue how to read it. Is there a way to convert this data using JavaScript in CSV format and save it in JavaScript file? The data looks like: { "count": 2, "items":…
praneybehl • 3,801 • 6 • 28 • 45

160 votes, 7 answers

ValueError : I/O operation on closed file

import csv with open('v.csv', 'w') as csvfile: cwriter = csv.writer(csvfile, delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL) for w, c in p.items(): cwriter.writerow(w + c) Here, p is a dictionary, w and c both are…
GobSmack • 2,171 • 4 • 22 • 28

154 votes, 16 answers

php implode (101) with quotes

Imploding a simple array would look like this $array = array('lastname', 'email', 'phone'); $comma_separated = implode(",", $array); and that would return this lastname,email,phone great, so i might do this instead $array = array('lastname',…
mcgrailm • 17,469 • 22 • 83 • 129

153 votes, 25 answers

Importing CSV with line breaks in Excel 2007

I'm working on a feature to export search results to a CSV file to be opened in Excel. One of the fields is a free-text field, which may contain line breaks, commas, quotations, etc. In order to counteract this, I have wrapped the field in double…
jeremyalan • 4,658 • 2 • 29 • 38

150 votes, 6 answers

How to parse a CSV file in Bash?

I'm working on a long Bash script. I want to read cells from a CSV file into Bash variables. I can parse lines and the first column, but not any other column. Here's my code so far: cat myfile.csv|while read line do read -d, col1 col2 <…
User1 • 39,458 • 69 • 187 • 265

150 votes, 19 answers

Importing a CSV file into a sqlite3 database table using Python

I have a CSV file and I want to bulk-import this file into my sqlite3 database using Python. the command is ".import .....". but it seems that it cannot work like this. Can anyone give me an example of how to do it in sqlite3? I am using windows…
Hossein • 40,161 • 57 • 141 • 175

150 votes, 5 answers

What does 'killed' mean when processing a huge CSV with Python, which suddenly stops?

I have a Python script that imports a large CSV file and then counts the number of occurrences of each word in the file, then exports the counts to another CSV file. But what is happening is that once that counting part is finished and the exporting…
user1893354 • 5,778 • 12 • 46 • 83

150 votes, 7 answers

Reading a huge .csv file

I'm currently trying to read data from .csv files in Python 2.7 with up to 1 million rows, and 200 columns (files range from 100mb to 1.6gb). I can do this (very slowly) for the files with under 300,000 rows, but once I go above that I get memory…
Charles Dillon • 1,945 • 6 • 15 • 18

149 votes, 15 answers

Which encoding opens CSV files correctly with Excel on both Mac and Windows?

We have a web app that exports CSV files containing foreign characters with UTF-8, no BOM. Both Windows and Mac users get garbage characters in Excel. I tried converting to UTF-8 with BOM; Excel/Win is fine with it, Excel/Mac shows gibberish. I'm…
Timm • 2,488 • 2 • 22 • 25

149 votes, 3 answers

Reading tab-delimited file with Pandas - works on Windows, but not on Mac

I've been reading a tab-delimited data file in Windows with Pandas/Python without any problems. The data file contains notes in first three lines and then follows with a header. df = pd.read_csv(myfile,sep='\t',skiprows=(0,1,2),header=(0)) I'm now…
user3062149 • 4,173 • 4 • 17 • 26

149 votes, 9 answers

read.csv warning 'EOF within quoted string' prevents complete reading of file

I have a CSV file (24.1 MB) that I cannot fully read into my R session. When I open the file in a spreadsheet program I can see 112,544 rows. When I read it into R with read.csv I only get 56,952 rows and this warning: cit <-…
Ben • 41,615 • 18 • 132 • 227

148 votes, 10 answers

converting CSV/XLS to JSON?

Does anyone know if there is an application that will let me convert XLS (preferably) to JSON? I'll also settle for a converter from CSV, since that's what I'll probably end up having to write myself if there is nothing around.
mkoryak • 57,086 • 61 • 201 • 257

148 votes, 11 answers

How to read data when some numbers contain commas as thousand separator?

I have a csv file where some of the numerical values are expressed as strings with commas as thousand separator, e.g. "1,513" instead of 1513. What is the simplest way to read the data into R? I can use read.csv(..., colClasses="character"), but…
Rob Hyndman • 30,301 • 7 • 73 • 85