Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc.). Rows are typically denoted by newline characters. Use this tag for any delimited file format, including tab-delimited (TSV).

CSV is a plain-text file format that stores data in a table structure, with the individual values separated by delimiters. CSV (comma-separated values) files traditionally and most commonly use a comma as the delimiter (hence the name), but other characters can be used, such as semicolons, tabs, or pipe symbols (|).
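
As a minimal sketch of how the delimiter choice affects parsing, the following uses Python's standard csv module to read the same two records written with a comma, a semicolon, and a tab. The sample strings and the parse helper are illustrative assumptions, not part of any particular tool.

```python
import csv
import io

# The same two records, written with three different delimiters.
samples = {
    ",": "a,b,c\n1,2,3\n",
    ";": "a;b;c\n1;2;3\n",
    "\t": "a\tb\tc\n1\t2\t3\n",
}


def parse(text, delimiter):
    """Parse delimited text into a list of rows (each row a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# Each sample is parsed with its own delimiter and yields identical rows.
parsed = {delimiter: parse(text, delimiter) for delimiter, text in samples.items()}
for delimiter, rows in parsed.items():
    print(repr(delimiter), rows)
```

The parser has to be told which delimiter to expect; reading the semicolon sample with the default comma delimiter would return each line as a single field.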

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain-text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes, which is required if a value contains the delimiter character. The first row often contains the headers of the table's columns, which describe the meaning of the data in each column.

Example

Tabular format

Time   Temperature  Humidity  Description
08:00  70           35        Sunny and Clear
11:45  94           90        Hazy, Hot, and Humid
14:30  18                     Freezing
16:00  -200                   "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""

In this example, the first row of the CSV data serves as the "header", which describes the data below it. Nothing within a CSV file itself indicates whether the first row is a header row or not. Each subsequent line of the CSV file should contain the same number of fields as the first line.
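
Because the file cannot say whether it has a header, the program reading it has to decide. The sketch below illustrates this with Python's csv.DictReader; the column names col1 and col2 are hypothetical placeholders chosen for this example.

```python
import csv
import io

text = "Time,Temperature\n08:00,70\n11:45,94\n"

# The caller decides the first row is a header: DictReader uses it for field names.
with_header = list(csv.DictReader(io.StringIO(text, newline="")))

# The caller decides there is no header: names are supplied explicitly
# (col1 and col2 are made up here), and every line is treated as data.
without_header = list(
    csv.DictReader(io.StringIO(text, newline=""), fieldnames=["col1", "col2"])
)

print(len(with_header), with_header[0])
print(len(without_header), without_header[0])
```

The same text yields two data rows under the first interpretation and three under the second, which is why tools that import CSV usually expose a "has header" option.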

Note:

  • Empty fields (fields with no available data, such as the third field in each of the last two lines) are still marked by their commas so that the fields that follow stay in the correct columns.
  • Since the comma is the field delimiter, the Description field of the second data row, which contains commas, must be quoted (to prevent those commas from being interpreted as field delimiters). Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double quote is the quote character, double quotes in the data, as in "Unliveable" in the fourth data row, must also be protected. Doubling the double quote, inside a field that is itself wrapped in double quotes, is the default method for protecting the quote character inside a field.
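
The quoting rules in these notes can be sketched with Python's standard csv module, which applies them by default (minimal quoting, with the double quote as the quote character). The rows mirror the example table; the variable names are illustrative.

```python
import csv
import io

rows = [
    ["Time", "Temperature", "Humidity", "Description"],
    ["08:00", "70", "35", "Sunny and Clear"],
    ["11:45", "94", "90", "Hazy, Hot, and Humid"],
    ["14:30", "18", "", "Freezing"],
    ["16:00", "-200", "", '"Unliveable"'],
]

# Write the rows; the writer quotes only the fields that need protection.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
lines = csv_text.splitlines()

# Read the text back; quoting is undone and empty fields come back as "".
round_trip = list(csv.reader(io.StringIO(csv_text, newline="")))

print(lines[2])
print(lines[4])
print(round_trip == rows)
```

The field with embedded commas is written as "Hazy, Hot, and Humid", the field with embedded quotes is written as """Unliveable""", and reading the text back returns the original rows.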

Questions tagged [csv] are expected to relate to programming in some way, for example, parsing or importing CSV files, or creating them programmatically.

89606 questions
114 votes, 6 answers

How to copy from CSV file to PostgreSQL table with headers in CSV file?

I want to copy a CSV file to a Postgres table. There are about 100 columns in this table, so I do not want to rewrite them if I don't have to. I am using the \copy table from 'table.csv' delimiter ',' csv; command but without a table created I get…
Soatl
114 votes, 4 answers

How to make separator in pandas read_csv more flexible wrt whitespace, for irregular separators?

I need to create a data frame by reading in data from a file, using read_csv method. However, the separators are not very regular: some columns are separated by tabs (\t), other are separated by spaces. Moreover, some columns can be separated by 2…
Roman
112 votes, 4 answers

Printing column separated by comma using Awk command line

I have a problem here. I have to print a column in a text file using awk. However, the columns are not separated by spaces at all, only using a single comma. Looks something like this: column1,column2,column3,column4,column5,column6 How would I…
user3364728
109 votes, 10 answers

Reading a UTF8 CSV file with Python

I am trying to read a CSV file with accented characters with Python (only French and/or Spanish characters). Based on the Python 2.5 documentation for the csvreader (http://docs.python.org/library/csv.html), I came up with the following code to read…
Martin
109 votes, 10 answers

CSV with comma or semicolon?

How is a CSV file built in general? With commas or semicolons? Any advice on which one to use?
membersound
108 votes, 3 answers

How to specify a tab in a postgres front-end COPY

I would like to use the psql "\copy" command to pull data from a tab-delimited file into Postgres. I'm using this command: \copy cm_state from 'state.data' with delimiter '\t' null as ; But I'm getting this warning (the table actually loads…
Chris Curvey
108 votes, 5 answers

Python Pandas read_csv skip rows but keep header

I'm having trouble figuring out how to skip n rows in a csv file but keep the header which is the 1 row. What I want to do is iterate but keep the header from the first row. skiprows makes the header the first row after the skipped rows. What is…
mcd
108 votes, 11 answers

Import CSV file to strongly typed data structure in .Net

What's the best way to import a CSV file into a strongly-typed data structure?
MattH
107 votes, 10 answers

Convert multiple rows into one with comma as separator

If I issue SELECT username FROM Users I get this result: username -------- Paul John Mary but what I really need is one row with all the values separated by comma, like this: Paul, John, Mary How do I do this?
Pavel Bastov
107 votes, 9 answers

How to read a CSV file from a URL with Python?

when I do curl to a API call link http://example.com/passkey=wedsmdjsjmdd curl 'http://example.com/passkey=wedsmdjsjmdd' I get the employee output data on a csv file format,…
mongotop
106 votes, 5 answers

Pandas: ValueError: cannot convert float NaN to integer

I get ValueError: cannot convert float NaN to integer for following: df = pandas.read_csv('zoom11.csv') df[['x']] = df[['x']].astype(int) The "x" is a column in the csv file, I cannot spot any float NaN in the file, and I don't understand the…
JaakL
106 votes, 8 answers

How to write UTF-8 in a CSV file

I am trying to create a text file in csv format out of a PyQt4 QTableWidget. I want to write the text with a UTF-8 encoding because it contains special characters. I use following code: import codecs ... myfile = codecs.open(filename,…
Martin
105 votes, 15 answers

"Line contains NULL byte" in CSV reader (Python)

I'm trying to write a program that looks at a .CSV file (input.csv) and rewrites only the rows that begin with a certain element (corrected.csv), as listed in a text file (output.txt). This is what my program looks like right now: import csv lines…
James Roseman
105 votes, 17 answers

Least used delimiter character in normal text < ASCII 128

For coding reasons which would horrify you (I'm too embarrassed to say), I need to store a number of text items in a single string. I will delimit them using a character. Which character is best to use for this, i.e. which character is the least…
Too embarrassed to say
105 votes, 11 answers

How to check encoding of a CSV file

I have a CSV file and I wish to understand its encoding. Is there a menu option in Microsoft Excel that can help me detect it OR do I need to make use of programming languages like C# or PHP to deduce it.
Vipul