Questions tagged [csv]

Comma-Separated Values or Character-Separated Values (CSV) is a common "flat file database" (or spreadsheet-style) format for storing tabular data in plain text, with fields separated by a special character (comma, tab, etc.). Rows are typically denoted by newline characters. Use this tag for any delimited file format, including tab-delimited (TSV).

CSV is a plain-text file format in which delimiters separate pieces of information so that data can be stored in a tabular structure. CSV (comma-separated values) files traditionally and most commonly use a comma as the delimiter (hence the name), but other characters can be used, such as semicolons, tabs, or pipe symbols (|).
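
As a minimal sketch of how the choice of delimiter affects parsing, the following uses Python's standard csv module. The helper name parse_delimited and the sample strings are made up for illustration; only csv.reader and its delimiter parameter come from the standard library.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# The same two-row table written with four different delimiters.
comma_rows = parse_delimited("a,b,c\n1,2,3\n")
semicolon_rows = parse_delimited("a;b;c\n1;2;3\n", delimiter=";")
tab_rows = parse_delimited("a\tb\tc\n1\t2\t3\n", delimiter="\t")
pipe_rows = parse_delimited("a|b|c\n1|2|3\n", delimiter="|")

print(comma_rows)
```

All four calls should yield the same rows, since only the separator character differs.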

The MIME type for CSV files is text/csv.

Information is often stored in CSV format to make it easy to transfer tables of data between applications. Each row of a table is represented as a list of plain-text (human-readable) values with a delimiter character between each discrete piece of data. Values may be enclosed in quotes, which is required if a value contains the delimiter character. The first row often contains the headers of the table's columns, which describe the meaning of the data in each column.
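
One way to see the header row and quoted values in action is Python's csv.DictReader, which treats the first row as column names. This is only a sketch; the sample data (names and cities) is hypothetical.

```python
import csv
import io

# Hypothetical data: the second field of the first record contains a comma,
# so it is enclosed in double quotes.
sample = 'name,city\nAda,"London, UK"\nLinus,Helsinki\n'

# DictReader uses the first row as the header and maps each later row to it.
records = list(csv.DictReader(io.StringIO(sample, newline="")))

for record in records:
    print(record["name"], "->", record["city"])
```

The quoted comma stays inside the city value instead of splitting it into two fields.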

Example

Tabular format

Time     Temperature    Humidity    Description
08:00    70             35          Sunny and Clear
11:45    94             90          Hazy, Hot, and Humid
14:30    18                         Freezing
16:00    -200                       "Unliveable"

CSV format

Time,Temperature,Humidity,Description
08:00,70,35,Sunny and Clear
11:45,94,90,"Hazy, Hot, and Humid"
14:30,18,,Freezing
16:00,-200,,"""Unliveable"""

In this example, the first row of CSV data serves as the "header", which describes the corresponding data below it. A CSV file has no inherent way to indicate whether its first row is a header row. Each subsequent line should contain the same number of fields, in the same order, as the first line.
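
Because nothing in the file enforces a consistent field count, a reader may want to check it. The following is a rough sketch using Python's csv module; find_ragged_rows is a hypothetical helper name, and the "bad" input is an invented variant of the example with one comma missing.

```python
import csv
import io


def find_ragged_rows(text, delimiter=","):
    """Return (row_number, field_count) for rows whose width differs from row 1."""
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    if not rows:
        return []
    expected = len(rows[0])
    return [
        (number, len(row))
        for number, row in enumerate(rows, start=1)
        if len(row) != expected
    ]


good = "Time,Temperature,Humidity,Description\n14:30,18,,Freezing\n"
# Invented faulty variant: the empty Humidity field has lost its comma.
bad = "Time,Temperature,Humidity,Description\n14:30,18,Freezing\n"

print(find_ragged_rows(good))
print(find_ragged_rows(bad))
```

Whether the first row is a header still has to be decided by the caller; this check only compares row widths.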

Note:

  • Empty fields (fields with no available data, such as the third field in the last two lines) are still marked by their delimiting commas so that the fields that follow are placed correctly.
  • Since the comma is the field delimiter, the commas in the Description field of the second data line must be protected so that they are not interpreted as field delimiters. Wrapping the entire field in double quotes (") is the default method for protecting the delimiter character inside a field.
  • Since the double quote is the quote character, double quotes in the data, as in "Unliveable" on the fourth data line, must also be protected. The default method is to double each double quote inside the field and wrap the entire field in double quotes.
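
The three notes above can be tried out with Python's csv module, whose default dialect follows these conventions. This is a sketch rather than a definitive reference: it writes the example table with csv.writer, then reads the text back with csv.reader to confirm that nothing is lost.

```python
import csv
import io

rows = [
    ["Time", "Temperature", "Humidity", "Description"],
    ["08:00", "70", "35", "Sunny and Clear"],
    ["11:45", "94", "90", "Hazy, Hot, and Humid"],  # contains the delimiter
    ["14:30", "18", "", "Freezing"],                # empty Humidity field
    ["16:00", "-200", "", '"Unliveable"'],          # contains the quote character
]

# Write the rows; the default dialect quotes only the fields that need it.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
lines = csv_text.splitlines()

# Read the text back to check that the round trip preserves every value.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))

for line in lines:
    print(line)
```

The writer quotes only the two Description values that contain a comma or a double quote, and leaves the empty Humidity fields as adjacent commas.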

Questions tagged [csv] are expected to relate to programming in some way, for example, parsing or importing CSV files, or creating them programmatically.

89,606 questions

13 votes, 2 answers

While using Export-Csv in powershell, how to exclude #TYPE Selected.System.Management.ManagementObject from the output

I am using the following PowerShell script to obtain Name, DisplayName, State, StartMode and PathName of all windows services of a local machine and then export the output to a csv file using the Export-csv cmdlet, Get-WmiObject win32_service |…
Abilash A (reputation 951; badges: 2 gold, 8 silver, 14 bronze)

13 votes, 3 answers

How to create a list in Python with the unique values of a CSV file?

I have CSV file that looks like the following, 1994, Category1, Something Happened 1 1994, Category2, Something Happened 2 1995, Category1, Something Happened 3 1996, Category3, Something Happened 4 1998, Category2, Something Happened 5 I want to…
Gravity M (reputation 1,485; badges: 5 gold, 16 silver, 28 bronze)

13 votes, 3 answers

Powershell System.Array to CSV file

I am having some difficulty getting an Export-Csv to work. I am creating an array like this... [pscustomobject] @{ Servername = $_.Servername Name = $_.Servername Blk = "" Blk2 = "" Method = "RDP" Port = "3389" } The issue…
Acerbity (reputation 417; badges: 1 gold, 11 silver, 29 bronze)

13 votes, 2 answers

R: Read csv with row and column name

I have a csv with column name in first row of the file, and column name in first column of the file, like this: ColName1 ColName2 ... ColNameN RowName1 .. RowName2 .. RowNameN .. If I use this command: read.csv("/Users/MNeptune/Documents/workspace…
Neptune (reputation 607; badges: 2 gold, 8 silver, 19 bronze)

13 votes, 6 answers

How to find the percentage of NAs in a data.frame?

I am trying to find the percentage of NAs in columns as well as inside the whole dataframe: The first method which I have commented gives me zero and the second method which is not commented gives me a matrix. Not sure what I am missing. Any hint is…
Mona Jalal (reputation 34,860; badges: 64 gold, 239 silver, 408 bronze)

13 votes, 4 answers

How can I declare a thousand separator in read.csv?

The dataset I want to read in contains numbers with and without a comma as thousand separator: "Sudan", "15,276,000", "14,098,000", "13,509,000" "Chad", 209000, 196000, 190000 and I am looking for a way to read this data in. Any hint appreciated!
Karsten W. (reputation 17,826; badges: 11 gold, 69 silver, 103 bronze)

13 votes, 1 answer

Powershell Format-Table to CSV

I have the following line in Powershell to output an array of data. The problem I am having is that Name,Title,Department do not go into columns. Instead I get a single column with each row in a single cell with tabs between. $outList | Format-Table…
devfunkd (reputation 3,164; badges: 12 gold, 45 silver, 73 bronze)

13 votes, 5 answers

Best practices for inserting/updating large amount of data in SQL Server 2008

I'm building a system for updating large amounts of data through various CSV feeds. Normally I would just loop though each row in the feed, do a select query to check if the item already exists and insert/update an item depending if it exists or…
Mark Clancy (reputation 7,831; badges: 8 gold, 43 silver, 49 bronze)

13 votes, 5 answers

How do I convert a tab-separated values (TSV) file to a comma-separated values (CSV) file in BASH?

I have some TSV files that I need to convert to CSV files. Is there any solution in BASH, e.g. using awk, to convert these? I could use sed, like this, but am worried it will make some mistakes: sed 's/\t/,/g' file.tsv > file.csv Quotes needn't be…
Village (reputation 22,513; badges: 46 gold, 122 silver, 163 bronze)

13 votes, 6 answers

Python regex for reading CSV-like rows

I want to parse incoming CSV-like rows of data. Values are separated with commas (and there could be leading and trailing whitespaces around commas), and can be quoted either with ' or with ". For example - this is a valid row: data1, data2 …
Tomasz Zieliński (reputation 16,136; badges: 7 gold, 59 silver, 83 bronze)

13 votes, 3 answers

Read a zipped .csv file in R

I have been trying hard to solve this, but I cannot get my head around how to read zipped .csv files in R. I could first unzip the files and then read them, but since the amount of unzipped data is around 22GB, I guess it is more practical to handle…
bosspe (reputation 145; badges: 1 gold, 1 silver, 7 bronze)

13 votes, 4 answers

Splitting a string and ignoring the delimiter inside quotes

I am using .NET's String.Split method to break up a string using commas, but I want to ignore strings enclosed in double quotes for the string. I have read that a For example, the string below. Fruit,10,"Bananas, Oranges, Grapes" I would like to…
Tachi (reputation 1,416; badges: 4 gold, 12 silver, 15 bronze)

13 votes, 2 answers

Python MemoryError: cannot allocate array memory

I've got a 250 MB CSV file I need to read with ~7000 rows and ~9000 columns. Each row represents an image, and each column is a pixel (greyscale value 0-255) I started with a simple np.loadtxt("data/training_nohead.csv",delimiter=",") but this gave…
stevendesu (reputation 15,753; badges: 22 gold, 105 silver, 182 bronze)

13 votes, 4 answers

Python's CSV reader and iteration

I have a CSV file that looks like this: "Company, Inc.",,,,,,,,,,,,10/30/09 A/R Summary Aged Analysis Report,,,,,,,,,,,,10:35:01 All Clients,,,,,,,,,,,,USER Client Account,Customer Name,15-Jan,16 - 30,31 - 60,61 - 90,91 - 120,120 -…
FrancisV (reputation 1,619; badges: 5 gold, 21 silver, 36 bronze)

13 votes, 6 answers

Browse for file path in python

I am trying to create a GUI with a browse window to locate a specific file. I found this question earlier: Browsing file or directory Dialog in Python although when I looked up the terms it didn't seem to be what I was looking for. All I need is…
Funkyguy (reputation 628; badges: 2 gold, 10 silver, 31 bronze)