Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc.). Rows are typically denoted by newline characters. Use this tag for any delimited file format, including tab-delimited (TSV).

CSV is a file format involving a plain text file with information separated by delimiters with the purpose of storing data in a table-structured format. CSV (comma separated values) files traditionally and most commonly use a comma delimiter (hence the name), but other characters can be used, such as semi-colons, tabs, pipe symbols (|), etc.
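As a minimal sketch of how the delimiter choice affects parsing, the snippet below reads the same small table written with tabs and with semicolons, using Python's standard csv module. The helper name read_delimited and the sample data are assumptions made for illustration only.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample data: the same two-column table with different delimiters.
tsv_text = "Time\tTemperature\n08:00\t70\n"
semicolon_text = "Time;Temperature\n08:00;70\n"

tsv_rows = read_delimited(tsv_text, delimiter="\t")
semicolon_rows = read_delimited(semicolon_text, delimiter=";")

# Both inputs describe the same table once the right delimiter is supplied.
print(tsv_rows == semicolon_rows)
```

If the wrong delimiter is passed, each line comes back as a single field rather than raising an error, so a one-column result is usually the first hint of a delimiter mismatch.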

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain-text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes, which is required if a value contains the delimiter character. The first row often contains the headers of the table's columns, which describe the meaning of the data in each column.
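One way to use a header row, sketched here with Python's standard csv.DictReader, is to let it name the fields of every following row. The sample data is hypothetical, and treating the first row as a header is an assumption the caller makes, not something the file itself declares.

```python
import csv
import io

# Hypothetical sample data whose first row is assumed to be a header.
csv_text = "Time,Temperature\n08:00,70\n11:45,94\n"

# DictReader takes the field names from the first row by default
# and yields one mapping per remaining row. All values are strings.
records = list(csv.DictReader(io.StringIO(csv_text)))

for record in records:
    print(record["Time"], record["Temperature"])
```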

Example

Tabular format

Time   Temperature  Humidity  Description
08:00  70           35        Sunny and Clear
11:45  94           90        Hazy, Hot, and Humid
14:30  18                     Freezing
16:00  -200                   "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""

In this example, the first row of CSV data serves as the "header", which describes the corresponding data below it. There is no inherent way to describe within a CSV file whether the first row is a header row or not. Each successive line of the CSV file should contain the same number of fields, in the same order, as the first line.

Note:

  • Empty fields (fields with no available data, such as the third field in the last line) are still delimited by commas, so that the fields that follow them stay in the correct columns.
  • Since the comma is the delimiter for fields, the commas in the Description field of the second data row must be quoted (to prevent them from being interpreted as field delimiters). Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double quote is the quote character, double quotes in the data, as in "Unliveable" in the fourth data row, must also be protected. The default method is to wrap the field in double quotes and double each quote character inside it.
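The quoting rules in these notes can be checked with a short round trip. This is a sketch using Python's standard csv module with its default dialect settings (comma delimiter, double-quote quote character, minimal quoting); the variable names are chosen for illustration.

```python
import csv
import io

# The example table as rows of strings; missing values are empty strings.
rows = [
    ["Time", "Temperature", "Humidity", "Description"],
    ["08:00", "70", "35", "Sunny and Clear"],
    ["11:45", "94", "90", "Hazy, Hot, and Humid"],
    ["14:30", "18", "", "Freezing"],
    ["16:00", "-200", "", '"Unliveable"'],
]

# Write the rows; the writer quotes only the fields that need it.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back; quoting and doubled quotes are undone by the reader.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed == rows)
```

Letting a CSV library do the quoting, rather than joining strings with commas by hand, avoids most of the pitfalls described in the notes.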

Questions tagged [csv] are expected to relate to programming in some way, for example, parsing/importing CSV files or creating them programmatically.

89606 questions
13 votes, 5 answers

How to join two tables using a comma-separated-list in the join field

I have two tables, categories and movies. In the movies table I have a column named categories. That column lists the categories the movie fits in, as IDs separated by commas. Here's an example: Table categories { -id- -name- …
Katie

13 votes, 3 answers

Proper way to reset csv.reader for multiple iterations?

Having an issue with a custom iterator in that it will only iterate over the file once. I am calling seek(0) on the relevant file object in between iterations, but StopIteration is thrown on the first call to next() on the 2nd run through. I feel I…
Derek Reynolds

13 votes, 3 answers

How to load Pickle file in chunks?

Is there any option to load a pickle file in chunks? I know we can save the data in CSV and load it in chunks. But other than CSV, is there any option to load a pickle file or any python native file in chunks?
Naren Babu R

13 votes, 6 answers

Line breaks in generated csv file driving me crazy

I'm trying to export some data I have (stored in a datatable). Some of those values have a line break in them. Now every time I try to import the file in Excel (2010), the line breaks get recognised as a new row, instead of an actual…
Melle Groenewoud

13 votes, 6 answers

Ignore header line when parsing CSV file

How can the header line of a CSV file be ignored in Ruby on Rails when parsing the CSV? Any ideas?
Deepak Lamichhane

13 votes, 3 answers

run powershell command using csv as input

I have a CSV in which every line looks like Name, email, address. I want to run New-Mailbox -Name "*Name*" -WindowsLiveID *email* -ImportLiveId (where *x* is replaced by the value from the CSV) on each line…
Hailwood

13 votes, 2 answers

How to see the progress bar of read_csv

I'm trying to read a 100 GB CSV file with file = pd.read_csv("../code/csv/file.csv") and I want to see a progress bar while the file is being read, like =====> 30%. Is there a way to show a progress bar for read_csv, or when reading other files?
user11173832

13 votes, 5 answers

Delphi TQuery save to csv file

I want to export the content of a TQuery to a CSV file without using a third-party component (Delphi 7). As far as I know, this cannot be accomplished with the standard Delphi components. My solution was to save the content in a StringList in CSV format,…
RBA

13 votes, 6 answers

Best way to work with large amounts of CSV data quickly

I have large CSV datasets (10M+ lines) that need to be processed. I have two other files that need to be referenced for the output—they contain data that amplifies what we know about the millions of lines in the CSV file. The goal is to output a new…
NJ.

13 votes, 5 answers

How to keep null values when writing to csv

I'm writing data from SQL Server into a CSV file using Python's csv module and then uploading the CSV file to a Postgres database using the copy command. The issue is that Python's csv writer automatically converts nulls into an empty string "" and…
Jonathan Porter

13 votes, 6 answers

Python pandas to_csv causes OSError: [Errno 22] Invalid argument

My code is the following: import pandas as pd import numpy as np df = pd.read_csv("path/to/my/infile.csv") df = df.sort_values(['distance', 'time']) df.to_csv("path/to/my/outfile.csv") this code reads from infile.csv which is a 3GB csv file…
João Matos

13 votes, 4 answers

Count separators in CSV rows with Pandas

I have a csv file as follows: name,age something tom,20 And when I put it into a dataframe it looks like: df = pd.read_csv('file', header=None) 0 1 1 name age 2 something NaN 3 tom 20 How would I get the…
user10503628

13 votes, 1 answer

How to use python csv.DictReader with a binary file? (For a babel custom extraction method)

I'm trying to write a custom extraction method for babel, to extract strings from a specific column in a csv file. I followed the documentation here. Here is my extraction method code: def extract_csv(fileobj, keywords, comment_tags, options): …
tiagosilva

13 votes, 1 answer

Zip list of files python

I have created some csv files in my code and I would like to zip them as one folder to be sent by e-mail. I already have the e-mail function but the problem is to zip. I tried to use this: here I am not extracting or find the files in a directory.…
may

13 votes, 7 answers

Parse Remote CSV File using Nodejs / Papa Parse?

I am currently working on parsing a remote csv product feed from a Node app and would like to use Papa Parse to do that (as I have had success with it in the browser in the past). Papa Parse Github: https://github.com/mholt/PapaParse My initial…
Necevil