I often work with comma-separated values and was curious about the differences between read_csv() and read.csv().

Are there any practical differences that would shed light on when to use each?

Jaap
mhovd

2 Answers

Quoted from the data import chapter of R for Data Science:

11.2.1 Compared to base R

If you’ve used R before, you might wonder why we’re not using read.csv(). There are a few good reasons to favour readr functions over the base equivalents:

They are typically much faster (~10x) than their base equivalents. Long running jobs have a progress bar, so you can see what’s happening. If you’re looking for raw speed, try data.table::fread(). It doesn’t fit quite so well into the tidyverse, but it can be quite a bit faster.

They produce tibbles, they don’t convert character vectors to factors*, use row names, or munge the column names. These are common sources of frustration with the base R functions.

They are more reproducible. Base R functions inherit some behaviour from your operating system and environment variables, so import code that works on your computer might not work on someone else’s.


*Note that as of R 4.0.0:

R [...] uses a stringsAsFactors = FALSE default, and hence by default no longer converts strings to factors in calls to data.frame() and read.table().
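
As a rough illustration of the second point in the quote, the sketch below reads the same small CSV with both functions. The CSV content and variable names are made up for this example, and it assumes the readr package (version 2.0 or later, for I() and show_col_types) is installed.

```r
# Assumption: readr >= 2.0 is installed; the data below is invented.
library(readr)

csv_text <- "first name,city\nAda,London\nLinus,Helsinki\n"

# Base R returns a plain data.frame and rewrites the header
# into syntactic names (check.names = TRUE by default).
base_df <- read.csv(text = csv_text)
names(base_df)            # "first.name" "city"
class(base_df)            # "data.frame"

# readr returns a tibble and keeps the header as written.
# I() marks the string as literal data rather than a file path.
readr_df <- read_csv(I(csv_text), show_col_types = FALSE)
names(readr_df)           # "first name" "city"
inherits(readr_df, "tbl_df")  # TRUE

# The pre-4.0.0 factor conversion can still be requested explicitly.
old_style <- read.csv(text = csv_text, stringsAsFactors = TRUE)
is.factor(old_style$city) # TRUE
```

Passing check.names = FALSE to read.csv() would keep the original header, so the difference is one of defaults rather than capability.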

Henrik
vanao veneri
read_csv() can parse numbers that use a comma as the thousands separator: it reads 1,000 as 1000.

[Screenshots omitted: original numbers; read by read_csv; read by read.csv]
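
A minimal sketch of this behaviour, assuming readr 2.0 or later and an invented two-row CSV in which the amounts are quoted so that the comma is not treated as a field delimiter. Whether read_csv() guesses a numeric type on its own may depend on the readr version, so the sketch also shows the column type being requested explicitly with col_number().

```r
# Assumption: readr >= 2.0 is installed; the data below is invented.
library(readr)

csv_text <- 'item,amount\nwidget,"1,000"\ngadget,"2,500"\n'

# Base R cannot convert "1,000" to a number, so the column
# stays character (R >= 4.0.0) or becomes a factor (older R).
base_df <- read.csv(text = csv_text)
class(base_df$amount)     # "character" on R >= 4.0.0

# Let readr guess the column type and inspect the result.
guessed <- read_csv(I(csv_text), show_col_types = FALSE)
class(guessed$amount)

# Request number parsing explicitly so the result does not
# depend on type guessing.
readr_df <- read_csv(
  I(csv_text),
  col_types = cols(amount = col_number()),
  show_col_types = FALSE
)
readr_df$amount           # 1000 2500

# The same parser is available for plain character vectors.
parse_number("1,000")     # 1000
```

For a column already read by read.csv(), calling parse_number() on it afterwards gives the same numeric values.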

Jingyi Ren