Questions tagged [readr]

readr is an R package written by Hadley Wickham. Its goal is to provide a fast and friendly way to read tabular data into R.

527 questions
2
votes
2 answers

Read a directory of txt files line by line into an R dataframe with filenames as one column

I have a directory of text files. I want to read the contents of these text files, line by line, into an R dataframe. The text files contain unstructured text. The desired dataframe output is:
file; line
1.txt; "line 1 in 1.txt"
1.txt; "line 2 in…
Simon Lindgren
  • 2,011
  • 12
  • 32
  • 46
2
votes
1 answer

how to specify the digits of numeric values when reading data with read.csv, read_csv or read_excel in R

I am trying to read geographic latitude and longitude into R. These geographic data are usually numeric values with more than 6 digits. I tried reading an Excel file with read_excel() from the readxl package, with read.csv in base R, and with read_csv() in…
Miao Cai
  • 902
  • 9
  • 25
2
votes
4 answers

Reading in Poor CSV File Structure

I am trying to read in a large csv datafile (delimited by ,), and I keep on getting stuck on rows such as the following: link to raw file: "http://daniels-pull.universityofdenv.netdna-cdn.com/assets/GeneralOccurrencesAll.csv" | RIN | UCR_Group |…
petergensler
  • 342
  • 2
  • 8
  • 23
2
votes
1 answer

parsing german numbers within string-vector

Having a string as follows: x <- c("31.12.2009EUR", "31.12.2009", "23.753,38", "0,00") I would like to parse it as c(NA, NA, 23753.38, 0.00) I tried: require(readr) parse_number(x, locale=locale(decimal_mark = ",")) # This ignores the…
Rentrop
  • 20,979
  • 10
  • 72
  • 100
2
votes
1 answer

dplyr : how to read a tsv file with headers while skipping some lines?

I have a simple tsv file with the following structure: 0 - headerline 1 - empty line 2 - PIG schema 3 - empty line 4 - 1-st line of DATA 5 - 2-nd line of DATA I would like to read it, possibly using readr::read_tsv but here is the problem. As you…
ℕʘʘḆḽḘ
  • 18,566
  • 34
  • 128
  • 235
2
votes
3 answers

can't prevent NAs for empty cells in factor columns using readr

I am trying to read a file with some empty cells and, as expected, I get NA for the empty cells. I have some special columns which can only have the values '' or '+', so I would like to set these columns to a factor class by using read_tsv('file.txt', …
drmariod
  • 11,106
  • 16
  • 64
  • 110
2
votes
1 answer

How to read CSV with \", sequence inside quoted character value in R?

Here is a CSV file with two character columns: "key","value" "a","\"," All character values are quoted by double quotes. And there is a sequence \", inside one of the values (escaped quote plus delimiter). I cannot correctly read this file neither…
fedyakov
  • 63
  • 9
2
votes
1 answer

How to use wildcards to define col_type when using readr?

A few days ago I asked how to set a specific column type when using the readr package (see "big integers when reading file with readr in r"). Is there a way to define the column names by wildcard? In my case, I sometimes have several columns starting with…
drmariod
  • 11,106
  • 16
  • 64
  • 110
2
votes
1 answer

big integers when reading file with readr in r

I wanted to use the readr package since I will work on some bigger files in the future. My problem is that there is a column called Intensity which has some very big values (e.g. 5493500000). The first time this big value appears is…
drmariod
  • 11,106
  • 16
  • 64
  • 110
2
votes
1 answer

read_fwf not working while unzipping files

I want to read in several fixed width format txt files into R but I first need to unzip them. Since they are very large files I want to use read_fwf from the readr package because it's very fast. When I do: read_fwf(unz(zipfileName, fileName),…
Warner
  • 1,353
  • 9
  • 23
2
votes
2 answers

Problems Importing txt file in R with readr instead of read.table

I'm trying to import the following text file:
"year" "sex" "name" "n" "prop"
"1" 1880 "F" "Mary" 7065 0.0723835869064085
"2" 1880 "F" "Anna" 2604 0.0266789611187951
"3" 1880 "F" "Emma" 2003 …
2
votes
1 answer

R: readr: how can I specify a datatype for just one (problematic) column (and not all)

readr is a great package, but people are too lazy to specify a data type for each column (out of 30, for example). Inspecting the parsing failures may reveal that only one column is the key problem. See it…
userJT
  • 11,486
  • 20
  • 77
  • 88
2
votes
1 answer

read_table() from readr package in R

I am currently attempting to use the read_table() function from the readr package on a few large data files. I only want the second column, so I set all the other columns to NULL with this argument in the function: col_types = c(paste("_", "c",…
2
votes
1 answer

Why does readr store date objects as integer values?

When reading in csv files using the readr package, date objects are stored as integer values. By "stored as integer" I don't mean the class of the date column; I mean the underlying date value R stores. This prevents the ability to use the…
Matt Mills
  • 588
  • 1
  • 6
  • 14
1
vote
1 answer

How to process multiple csv files for identifying null values in R?

I have various .csv files. Each file has multiple columns. I am using the given code in R to run a quality check that counts, for a particular column, how many rows have valid values and how many are null. The code works well for a single csv file. But I…