
I have a moderately sized data.frame, and I tested writing it to and reading it from a network drive in both uncompressed rds and feather formats. The results show that write_feather is much faster than saveRDS, but read_feather is much slower than readRDS.

Questions: Is this caused by the particular network configuration at my workplace (i.e. is it just me)? Or does it come from an inherent difference in how read_feather and readRDS handle remote files? Should I stick with rds for now?

> print(object.size(impData),units="auto")
364.4 Mb

## SAVING
> system.time(feather::write_feather(impData,path="M:/waangData/test.feather"))
   user  system elapsed 
   0.52    0.16    4.80 
> system.time(saveRDS(impData,file="M:/waangData/Data4predictImp.rds",compress=F))
   user  system elapsed 
   4.23    2.35   28.61 

## READING
> system.time({t2=feather::read_feather("M:/waangData/test.feather")})
   user  system elapsed 
   0.59    1.54  134.39 
> system.time({t=readRDS("M:/waangData/Data4predictImp.rds")})
   user  system elapsed 
   2.36    0.61   19.59 
qoheleth
  • Why don't you time it using local copies of those files first? Then any unexplained timings will be due to the network. When the network is slow, your file sizes should be as small as possible. You might also want to check out the `fst` package. – chinsoon12 Apr 19 '17 at 07:29
  • Thanks for the reply. I have clarified the question. No doubt the fact that the file is remote is slowing things down. My question is really: is there some innate design limitation of `read_feather` that makes it slower than `readRDS` when reading remote files? – qoheleth Apr 19 '17 at 23:56
  • @qoheleth I have noticed this too. Reading `feather` objects into R is quite slow even when reading it locally, with sizes of 10-15MB. Have you found any solutions or reasons why this is? – guy Aug 10 '17 at 12:51
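
The local-versus-network comparison suggested in the comments could be set up roughly as follows. This is only a sketch: `time_io` and the `demo` data frame are hypothetical names introduced here, the stand-in data is much smaller than the 364 Mb `impData`, and the feather timings are skipped if the `feather` package is not installed.

```r
# Sketch: run the same write/read calls against any directory, so that a
# local run and a network run can be compared like for like.
# `time_io` is a hypothetical helper, not part of any package.
time_io <- function(df, dir) {
  rds_path <- file.path(dir, "test.rds")
  rds_write <- system.time(saveRDS(df, rds_path, compress = FALSE))[["elapsed"]]
  rds_read  <- system.time(readRDS(rds_path))[["elapsed"]]
  out <- c(rds_write = rds_write, rds_read = rds_read)

  # Only time feather when the package is available.
  if (requireNamespace("feather", quietly = TRUE)) {
    feather_path <- file.path(dir, "test.feather")
    f_write <- system.time(feather::write_feather(df, feather_path))[["elapsed"]]
    f_read  <- system.time(feather::read_feather(feather_path))[["elapsed"]]
    out <- c(out, feather_write = f_write, feather_read = f_read)
  }
  out
}

# Hypothetical stand-in data; substitute impData in practice.
demo <- data.frame(x = rnorm(1e5),
                   y = sample(letters, 1e5, replace = TRUE))

# Local baseline in the session's temporary directory.
local_times <- time_io(demo, tempdir())
print(local_times)

# The same call against the network drive from the question would be:
# network_times <- time_io(impData, "M:/waangData")
```

If the feather read is slow only in the network run, the network is the likely cause; if it is also slow locally, the format or package is the more likely suspect. A single run of each call is noisy, so repeating the timings a few times would give a fairer picture.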

0 Answers