Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc). Rows are typically denoted by newline characters. Use for any delimited file formats, including tab delimited (TSV)

CSV is a file format involving a plain text file with information separated by delimiters with the purpose of storing data in a table-structured format. CSV (comma separated values) files traditionally and most commonly use a comma delimiter (hence the name), but other characters can be used, such as semi-colons, tabs, pipe symbols (|), etc.

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes, which is required if they contain the delimiter as a value. The first row of data often contains headers of table's columns, which describe the meaning of the data in each column.

Example

Tabular format

Time Temperature Humidity Description
08:00 70 35 Sunny and Clear
11:45 94 90 Hazy, Hot, and Humid
14:30 18 Freezing
16:00 -200 "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,""Unliveable""

In this example, the first row of CSV data serves as the "header", which describes the corresponding data below it. There is no inherent way to describe within a CSV file whether the first row is a header row or not. Each successive line of the CSV file should neatly fit into the same field as the first line.

Note:

  • Empty fields (fields with no available data, such as the third field in the last line) are place-held with commas so that the fields that follow may be correctly placed.
  • Since the comma is the delimiter for fields, the commas in the Description field of the second line must be quoted (to prevent them from being interpreted as field delimiters). Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double-quote is the delimiter quote character, double-quotes in the data, as in "Unliveable" on the fourth line, must also be protected. Doubling-up the double-quote is the default method for protecting the quote character inside a field.

Questions tagged are expected to relate to programming in some way, for example, parsing/importing CSV files or creating them programmatically.

Related links:

89606 questions
122
votes
8 answers

Writing a dictionary to a csv file with one line for every 'key: value'

I've got a dictionary: mydict = {key1: value_a, key2: value_b, key3: value_c} I want to write the data to a file dict.csv, in this style: key1: value_a key2: value_b key3: value_c I wrote: import csv f = open('dict.csv','wb') w =…
user1106770
  • 1,389
  • 4
  • 12
  • 14
122
votes
7 answers

Specifying colClasses in the read.csv

I am trying to specify the colClasses options in the read.csv function in R. In my data, the first column time is basically a character vector, while the rest of the columns are numeric. data <- read.csv("test.csv", comment.char="" , …
defoo
  • 5,159
  • 11
  • 34
  • 39
121
votes
4 answers

Export result set on Dbeaver to CSV

Normally I use Dbeaver for windows and always export my result set like this: Run my query --> select the result --> export the result set --> select export to clipboard --> done This step by step puts my result set in my clipboard and I can paste…
Rafael Meirelles
  • 1,323
  • 2
  • 8
  • 10
121
votes
19 answers

Python CSV error: line contains NULL byte

I'm working with some CSV files, with the following code: reader = csv.reader(open(filepath, "rU")) try: for row in reader: print 'Row read successfully!', row except csv.Error, e: sys.exit('file %s, line %d: %s' % (filename,…
AP257
  • 89,519
  • 86
  • 202
  • 261
121
votes
16 answers

Java lib or app to convert CSV to XML file?

Is there an existing application or library in Java which will allow me to convert a CSV data file to XML file? The XML tags would be provided through possibly the first row containing column headings.
A Salim
  • 1,325
  • 2
  • 11
  • 16
120
votes
4 answers

Sort CSV file by multiple columns using the "sort" command

I have a CSV-like file, and I would like to sort it by column priority, like "ORDER BY" in SQL. For example, given the following rows, 3;1;2 1;3;2 1;2;3 2;3;1 2;1;3 3;2;1 If "ORDER BY" were column2, column1, column3, the result would…
Rafael Orágio
  • 1,718
  • 5
  • 19
  • 26
120
votes
12 answers

How to export collection to CSV in MongoDB?

How do you export all the records in a MongoDB collection to a .csv file? mongoexport --host localhost --db dbname --collection name --type=csv > test.csv This asks me to specify name of the fields I need to export. Can I just export all the fields…
Succeed Stha
  • 1,687
  • 2
  • 12
  • 13
120
votes
11 answers

start index at 1 for Pandas DataFrame

I need the index to start at 1 rather than 0 when writing a Pandas DataFrame to CSV. Here's an example: In [1]: import pandas as pd In [2]: result = pd.DataFrame({'Count': [83, 19, 20]}) In [3]: result.to_csv('result.csv', index_label='Event_id') …
Clark Fitzgerald
  • 1,305
  • 2
  • 10
  • 7
119
votes
12 answers

How to convert CSV file to multiline JSON?

Here's my code, really simple stuff... import csv import json csvfile = open('file.csv', 'r') jsonfile = open('file.json', 'w') fieldnames = ("FirstName","LastName","IDNumber","Message") reader = csv.DictReader( csvfile, fieldnames) out =…
BeanBagKing
  • 2,003
  • 4
  • 18
  • 24
118
votes
4 answers

Pandas read_csv dtype read all columns but few as string

I'm using Pandas to read a bunch of CSVs. Passing an options json to dtype parameter to tell pandas which columns to read as string instead of the default: dtype_dic= { 'service_id':str, 'end_date':str, ... } feedArray = pd.read_csv(feedfile , dtype…
Nikhil VJ
  • 5,630
  • 7
  • 34
  • 55
118
votes
4 answers

Pythonically add header to a csv file

I wrote a Python script merging two csv files, and now I want to add a header to the final csv. I tried following the suggestions reported here and I got the following error: expected string, float found. What is the most pythonic way to fix…
albus_c
  • 6,292
  • 14
  • 36
  • 77
118
votes
1 answer

How do I import a CSV file in R?

I have a .csv file in my workstation. How can I open that file in R and do statistical calculation?
Uselesssss
  • 2,127
  • 6
  • 28
  • 37
116
votes
17 answers

Convert XLS to CSV on command line

How could I convert an XLS file to a CSV file on the windows command line. The machine has Microsoft Office 2000 installed. I'm open to installing OpenOffice if it's not possible using Microsoft Office.
Joel
  • 11,431
  • 17
  • 62
  • 72
115
votes
6 answers

How to split CSV files as per number of rows specified?

I've CSV file (around 10,000 rows ; each row having 300 columns) stored on LINUX server. I want to break this CSV file into 500 CSV files of 20 records each. (Each having same CSV header as present in original CSV) Is there any linux command to…
Pawan Mude
  • 1,609
  • 6
  • 19
  • 32
115
votes
3 answers

How do you remove the column name row when exporting a pandas DataFrame?

Say I import the following Excel spreadsheet into a dataframe: Val1 Val2 Val3 1 2 3 5 6 7 9 1 2 How do I delete the column name row (in this case Val1, Val2, Val3) so that I can export a csv with no column names, just the…
cmgerber
  • 2,199
  • 3
  • 16
  • 15