Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc.). Rows are typically denoted by newline characters. Use this tag for any delimited file format, including tab-delimited (TSV).

CSV is a plain-text file format that stores tabular data, with values separated by delimiters. CSV (comma-separated values) files traditionally and most commonly use a comma as the delimiter (hence the name), but other characters can be used, such as semicolons, tabs, or pipe symbols (|).
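
As a minimal sketch of how the delimiter choice shows up in code, Python's standard csv module accepts a delimiter argument. The helper name parse_delimited and the sample strings are made up for illustration.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each a list of strings)."""
    # newline="" leaves line-ending handling to the csv module.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# Hypothetical sample data: the same two rows written with three delimiters.
comma_rows = parse_delimited("a,b,c\n1,2,3\n")
semicolon_rows = parse_delimited("a;b;c\n1;2;3\n", delimiter=";")
tab_rows = parse_delimited("a\tb\tc\n1\t2\t3\n", delimiter="\t")

# All three parse to the same table once the right delimiter is given.
print(comma_rows == semicolon_rows == tab_rows)
```

The parser has to be told which delimiter is in use; nothing in the file itself declares it.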

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes; quoting is required when a value contains the delimiter itself. The first row of data often contains headers for the table's columns, which describe the meaning of the data in each column.

Example

Tabular format

Time   Temperature  Humidity  Description
08:00  70           35        Sunny and Clear
11:45  94           90        Hazy, Hot, and Humid
14:30  18                     Freezing
16:00  -200                   "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""

In this example, the first row of CSV data serves as the "header", which describes the corresponding data below it. There is no inherent way to indicate within a CSV file whether the first row is a header row or not. Each successive line of the CSV file should contain the same number of fields, in the same order, as the first line.
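
A minimal sketch of checking that every record has as many fields as the first one, using Python's standard csv module. The function name mismatched_records is hypothetical, and treating the first row as the reference width is an assumption made by the caller, since the file itself cannot say whether that row is a header.

```python
import csv
import io

# The example table in CSV form, with the quote characters in the last
# row doubled and the whole field wrapped in quotes.
CSV_TEXT = '''Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""
'''


def mismatched_records(text):
    """Return 1-based record numbers whose field count differs from record 1."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return []
    expected = len(rows[0])
    return [i for i, row in enumerate(rows, start=1) if len(row) != expected]


print(mismatched_records(CSV_TEXT))
print(mismatched_records("a,b\n1,2,3\n4,5\n"))
```

The first call reports no mismatches for the example table; the second uses a deliberately malformed input whose second record has one field too many.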

Note:

  • Empty fields (fields with no available data, such as the third field in the last line) are held in place with commas so that the fields that follow are positioned correctly.
  • Since the comma is the field delimiter, the Description field of the second data row, which contains commas, must be quoted (to prevent those commas from being interpreted as field delimiters). Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double quote is the quote character, double quotes in the data, as in "Unliveable" in the fourth data row, must also be protected. Doubling each double quote inside a quoted field is the default method for protecting the quote character.
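
These three rules can be sketched with Python's standard csv module, which applies them by default when writing (its default quoting mode is QUOTE_MINIMAL). The variable names are made up for illustration; the data is the example table.

```python
import csv
import io

rows = [
    ["Time", "Temperature", "Humidity", "Description"],
    ["08:00", "70", "35", "Sunny and Clear"],
    ["11:45", "94", "90", "Hazy, Hot, and Humid"],
    ["14:30", "18", "", "Freezing"],           # empty Humidity field
    ["16:00", "-200", "", '"Unliveable"'],     # value contains double quotes
]

# Write to an in-memory buffer; lineterminator="\n" keeps the output simple
# (the module's default terminator is "\r\n").
buffer = io.StringIO(newline="")
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the original values, quotes and all.
round_tripped = list(csv.reader(io.StringIO(csv_text, newline="")))
print(round_tripped == rows)
```

Only the fields that need protection get quoted: the one containing commas and the one containing double quotes.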

Questions tagged [csv] are expected to relate to programming in some way, for example, parsing/importing CSV files or creating them programmatically.

89606 questions
14 votes, 1 answer

How to write a csv with a comma as the decimal separator?

I am trying to create a European-formatted CSV in Python. I already set the separator to a semicolon: writer = csv.writer(response, delimiter=';', quoting=csv.QUOTE_ALL). However, this still uses a dot (.) as the decimal separator. What's the correct way…
maniexx
14 votes, 1 answer

Python list to csv throws error: iterable expected, not numpy.int64

I want to write a list to a CSV file. When I try, I receive the error below: out.writerows(fin_city_ids) _csv.Error: iterable expected, not numpy.int64. My code is as below…
arpit joshi
14 votes, 17 answers

"CSV file does not exist" for a filename with embedded quotes

I am currently learning Pandas for data analysis and having some issues reading a CSV file in the Atom editor. When I run the following code: import pandas as pd df = pd.read_csv("FBI-CRIME11.csv") print(df.head()) I get an error message,…
Aleksei Nabatov
14 votes, 2 answers

Formatting output of CSV file in Python

I am creating a very rudimentary "Address Book" program in Python. I am grabbing contact data from a CSV file, the contents of which look like the following example: Name,Phone,Company,Email Elon Musk,454-6723,SpaceX,emusk@spacex.com Larry…
shaneybrainy13
14 votes, 3 answers

Can I append to a compressed stream with pandas?

I know that by passing the compression='gzip' argument to pd.to_csv() I can save a DataFrame into a compressed CSV file. my_df.to_csv('my_file_name.csv', compression='gzip') I also know that if I want to append a DataFrame to the end of an existing…
Eric Hansen
14 votes, 1 answer

Python generator to read large CSV file

I need to write a Python generator that yields tuples (X, Y) coming from two different CSV files. It should receive a batch size on init, read line after line from the two CSVs, yield a tuple (X, Y) for each line, where X and Y are arrays (the…
d.grassi84
14 votes, 1 answer

ignoring rows with unmatching dtype in pandas

I'm specifying dtypes while reading a huge CSV in pandas: pd.read_csv('29_2016/data.csv', error_bad_lines=False, encoding='utf-8', dtype={'a': str, 'b': np.float64, 'c':np.float64}, …
Abhishek Thakur
14 votes, 4 answers

What is a convenient way to store and retrieve boolean values in a CSV file

If I store a boolean value using the CSV module, it gets converted to the strings True or False by the str() function. However, when I load those values, a string of False evaluates to being True because it's a non-empty string. I can work around it…
Simon Hibbs
14 votes, 2 answers

Cassandra .csv import error:batch too large

I'm trying to import data from a .csv file to Cassandra 3.2.1 via the COPY command. The file has only 299 rows with 14 columns. I get the error: Failed to import 299 rows: InvalidRequest - code=2200 [Invalid query] message="Batch too large" I used the…
Emlon
14 votes, 3 answers

Python Pandas - Read csv file containing multiple tables

I have a single .csv file containing multiple tables. Using Pandas, what would be the best strategy to get two DataFrames, inventory and HPBladeSystemRack, from this one file? The input .csv looks like this: Inventory System Name IP…
JahMyst
14 votes, 3 answers

Efficient read and write CSV in Go

The Go code below reads in a 10,000 record CSV (of timestamp times and float values), runs some operations on the data, and then writes the original values to another CSV along with an additional column for score. However it is terribly slow (i.e.…
BoltzmannBrain
14 votes, 3 answers

How to parse a CSV file that might have one of two delimiters?

In my case, valid CSV files are those delimited by either a comma or a semicolon. I am open to other libraries, but it needs to be Java. Reading through the Apache CSVParser API, the only thing I can think of is to do this, which seems inefficient and…
Coder1224
14 votes, 9 answers

Parsing a CSV file using gawk

How do you parse a CSV file using gawk? Simply setting FS="," is not enough, as a quoted field with a comma inside will be treated as multiple fields. Example using FS="," which does not work: file contents: one,two,"three, four",five "six,…
MCS
14 votes, 3 answers

Exporting SQLite Database to csv file in android

I am trying to export SQLite data to the SD card in Android as a CSV file in a directory. So I have tried the method below, and apparently it only prints out this text: FIRST TABLE OF THE DATABASE DATE,ITEM,AMOUNT,CURRENCY In my DBHelper.java I…
Steve Kamau
14 votes, 4 answers

View row values in openpyxl

In Python's csv module, there is a function called csv.reader, which returns a reader object that lets you iterate over rows and can be held in a container such as a list. So when the list is assigned to a variable and printed, i.e.: csv_rows =…
dyao