Questions tagged [vroom]

24 questions
0
votes
0 answers

RStudio read_delim(): intermittently receive error std::bad_alloc upon opening files with unusual delimiter

I received a series of 100+ files from a client. This client received the files as part of litigation, so they didn't have to be transmitted in a convenient fashion; they just all had to be present. In a single .zip file, all the files are all…
Dave
  • 1
  • 1
0
votes
1 answer

full join error in R since switching from read.csv

I've just swapped out read.csv for vroom, and since then my full_join is not working. It throws the following error: Error in [.data.table(y, x, nomatch = if (all.x) NA else NULL, on = by, : logical error. i is not a data.table, but 'on' argument is…
Magnetar
  • 85
  • 8
0
votes
0 answers

Why does col_type 'Date' not work in Vroom?

I have a very large data file with 2 date columns in the format YYYY-MM-DD. I used vroom to read it: df <- vroom("path....csv", col_types=c(ID="c",address="c",date_from="D",date_to="D") typeof(df$date_from) --- this returns 'dbl'. I don't understand why…
Hong
  • 73
  • 8
0
votes
3 answers

Converting 7 or 8 digit numbers to dates in R

I am importing a very large fixed-width dataset into R and wish to use vroom for much better speed. However, the dates in this dataset are in numeric format with either 7 or 8 digits, depending on whether the day of the month has 1 or 2 digits…
Richard Berry
  • 396
  • 1
  • 9
0
votes
0 answers

R Combine use of new readr/vroom lazy loading + dplyr AND dtplyr/data.table?

I am loading a large dataset from which I need to filter approximately 1/20th of the rows, then group_by 5 columns and summarize the 3 remaining ones. This page https://vroom.r-lib.org/articles/benchmarks.html says sampling, filtering, and grouped…
Arthur Yip
  • 5,810
  • 2
  • 31
  • 50
0
votes
0 answers

Is there a limit to the number of observations that dplyr filter can successfully detect?

I am working with a large dataset with over 200 million rows. I load the dataset using the vroom package to speed up processing time. When I filter the dataset using an %in% condition, the process misses observations. I am wondering if there is a…
SolarSon
  • 11
  • 4
0
votes
1 answer

Performant implementation of function for converting data.frame to delimited string in R

I am looking for a fast serialization function to convert a data.frame to a delimited string in R. At the moment I am using readr::format_tsv (versions readr_2.0.0, vroom_1.5.3) for the conversion, and I am wondering if there is a faster…
Matthias Munz
  • 3,583
  • 4
  • 30
  • 47
0
votes
1 answer

Compressing files through vroom and piping to pixz

I would like to write the mtcars data.frame to a file with compression (xz in my case) using the vroom package and pixz, but I can't get it to write a file. It complains that pixz does not exist even though it is installed locally. According to the…
HCAI
  • 2,213
  • 8
  • 33
  • 65
0
votes
1 answer

Why is vroom so slow?

I have a simple operation where I read several CSVs, bind them, and then export, but vroom is performing much slower than other methods. I must be doing something wrong, but I'm not sure what, or…
Kyouma
  • 320
  • 6
  • 14