I am trying to import a large SAS dataset (75.5 million rows, sas7bdat) into RStudio in a way that lets me work with the whole dataset. After some digging and talking to a few people, it sounds like what I want is to read the file without loading it all into memory, which is why I was attempting to use chunked::read_csv_chunkwise
as this suggests:
https://mnoorfawi.github.io/handling-big-files-with-R/
I used SAS to export the dataset as a CSV file, v8_mc.csv. Then in R:
library(chunked)
library(dplyr)

## Here we don't read the file, we just get something like a pointer to it.
data_chunked <- read_csv_chunkwise("B:/SAS_DATASETS/v8_mc.csv",
                                   skip = 1, stringsAsFactors = FALSE,
                                   header = TRUE, sep = ",")
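In case it helps an answerer, here is my understanding of the pattern from the linked post, as a self-contained sketch with a toy CSV (the column names and filter are made up, just stand-ins for the real file): the chunked object is lazy, so nothing is actually read until you call collect() or write the result out.

```r
library(chunked)
library(dplyr)

# Toy CSV standing in for v8_mc.csv (the real file is 75.5M rows).
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:100, value = runif(100)), tmp, row.names = FALSE)

# Lazy: this only sets up the pipeline; nothing is read yet.
result <- read_csv_chunkwise(tmp, chunk_size = 10,
                             header = TRUE, sep = ",",
                             stringsAsFactors = FALSE) %>%
  filter(value > 0.5)

# Only collect() actually streams the file chunk by chunk.
df <- collect(result)
```

If I have this model wrong, corrections welcome.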
But I get the following warning:
In FUN(X[[i]], ...) : Unsupported type 'logical'; using default type 'string'
The documentation says that head()
should work with the chunked object, so I figured what the heck and tried:
> head(data_chunked)
Error in .local(x, ...) :
Conversion to int failed; line=957; column=52; string='V061'
I've never used SAS before and I'm a total newbie to big data in R. Since I can't open the SAS or CSV file in R, I can't figure out how to make a reproducible example. I'd welcome suggestions on how to make this a better question.
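My best attempt at a reproducible example without the real file, based purely on a guess about what's happening (the type sniffing sees only an early sample of rows, guesses integer for that column, then hits a string like 'V061' further down):

```r
library(chunked)

# Build a small CSV where the second column looks integer for the first
# 1000 rows and then contains a string, mimicking the 'V061' value the
# real file has at line 957, column 52.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,code",
             paste(1:1000, 1:1000, sep = ","),
             "1001,V061"),
           tmp)

data_chunked <- read_csv_chunkwise(tmp, header = TRUE, sep = ",")

# If the guess is right, forcing a full read should trigger the same
# 'Conversion to int failed' error; wrapped in try() so the script survives.
df <- try(collect(data_chunked), silent = TRUE)
```

I'm not sure this mimics my file closely enough to reproduce the error, so treat it as a starting point.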
Thanks!