Questions tagged [readr]

readr is an R package, written by Hadley Wickham, that provides a fast and friendly way to read tabular data into R.

527 questions
1
vote
2 answers

Function to group by and plot? - R

I'm new to R and I'm doing the R course from DataQuest. I have a csv of forest fires. The file can be downloaded here: https://archive.ics.uci.edu/ml/machine-learning-databases/forest-fires/ I want to create a function that groups the data by "x"…
Julian
1
vote
0 answers

Fast (specific) word count in a 70 GB text file on a single laptop

I have a raw text file weighing 70GB, over 1B rows of differing length, no columns involved, raw text. I wish to scan it and simply count how many times each word of a predefined set search_words appears (size ~100). Currently I'm using…
Giora Simchoni
1
vote
0 answers

R (read_csv2) converts column to logical and replaces values with NAs in unbalanced dataset

I have a csv file with 24 columns and 271,691 rows that I read into R like this: df <- read_csv2("C:/FinData/Testfile.csv") Everything works fine except for one column that R converts to col_logical() even though it contains 44 cells with numerical…
Economist
1
vote
1 answer

Read multiple pages of a pdf with read_lines

I'm using pdftools to import text into R from a pdf, and readr to read it in line by line. It works for the first page but stops there. It seems like it would be so simple to read in all pages of a document and yet I get the same result with…
votmoyd
1
vote
2 answers

readr::read_csv() - parsing failure with nested quotations

I have a csv where some columns contain a quoted value with another quotation inside it: "blah blah "nested quote"" and it generates parsing failures. I'm not sure if this is a bug or whether there is an argument to deal with this. Reprex (file is here or…
jzadra
1
vote
3 answers

How to tell readr::read_csv to guess double column correctly

I have runoff data with a lot of zero values and occasionally some non-zero double values. 'readr::read_csv' guesses an integer column type because of the many zeros. How can I make read_csv guess the correct double column type? I do not know the…
Thomas Wutzler
1
vote
1 answer

How to define column specification for similarly named column with readr?

I have a database with 250 columns and want to read only 50 of them, instead of loading all of them and then dropping columns with dplyr::select. I suppose I can do that using a column specification. I don't want to type the column specification…
Romain
1
vote
1 answer

Having trouble reading a .csv database

I'm trying to read in a .csv file using readr::read_csv: readr::read_csv("my_file.csv"). But I got the following error: Parsed with column specification: cols( col_character() ) Error in read_tokens_(data, tokenizer, col_specs, col_names, locale_, …
1
vote
1 answer

RStudio read xlsx file, head() function doesn't display all the decimals of the values stored

I am trying to read a scraped dataset stored in an Excel file. Once I loaded the file in RStudio, I checked head(dataset, 5), but the columns of double values don't show full decimals; instead they show: 100. However, if I use View(dataset), it…
ACuriousCat
1
vote
1 answer

readr and write_csv: double precision numbers and grisu3

Sometimes, when I save a column of double precision numbers to a csv using write_csv from readr (part of the tidyverse), the following happens: a double like 285121.15 is written as 285121.14999999997. The original value has only two decimals and…
larry77
1
vote
1 answer

Forcing read_delim in readr to treat multiple " and \ as part of column string

Given a ; delimited file of structure: colA; colB; colC 1;A; 10 2;B; 11 3;C"; 12 4;D""; 15 5;"F";20 6;K"""; 21 7;""M";22 8; \""O;23 I would like to ensure that colB is always imported verbatim as a character string. In particular, I would like…
Konrad
1
vote
1 answer

Import data in R with more precision

I want to import data in CSV format into RStudio. The data looks like this: 1732.7193603515625 ,7825.7729492187500 1732.7191162109375 ,7825.7714843750000 1732.7191162109375 ,7825.7714843750000 1732.7191162109375 ,7825.7714843750000…
1
vote
2 answers

write_csv Scientific notation depending on trailing "000"?

Writing a csv with the write_csv() function from package readr seems to treat numbers differently depending on trailing zeros. 4001705344 is saved as is, but 4100738000 is saved as 4100738e3 in the csv. This causes problems when I reopen the csv…
gplngr
1
vote
2 answers

Read a text file with readr where a quote ends rows

I have a text file that looks something like this: a,b,c,d "string1","string2","string3"," "string4","string5","string6"," The file itself is comma separated, but each line ends with a double quote (i.e., not the comma delimiter).…
Chris
1
vote
0 answers

How to make tibble saved with write_tsv readable by read_tsv

I have a quite large tibble() (data.frame) which I save with write_tsv() and would like to read with read_tsv(). I am using all default options. However, read_tsv() emits a bunch of warnings (see example below). What strategy could I use to make it…
witek