Questions tagged [ff]

An R package that provides memory-efficient storage of large data on disk and fast access functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory.
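As a minimal sketch of what that looks like in practice (assuming the ff package is installed; the variable names here are illustrative only), a file-backed vector and a file-backed data frame can be created and indexed much like ordinary R objects:

```r
library(ff)

# Create a file-backed double vector with one million elements.
# The data lives in a temporary file on disk, not in RAM.
x <- ff(vmode = "double", length = 1e6)

# Assign to and read back a small chunk; only this chunk is pulled into memory.
x[1:5] <- c(1, 2, 3, 4, 5)
first_sum <- sum(x[1:5])

# Convert an ordinary in-memory data frame to an ffdf (file-backed data frame).
irisdf <- as.ffdf(iris)
n_rows <- nrow(irisdf)
```

Indexing with `[` returns a regular in-memory R vector, so it is usually best to work on chunks rather than materialise the whole object at once.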

165 questions
0
votes
2 answers

How to perform arithmetic operations on an ffdf object of the ff package

I have a script that creates an ffdf object: library(ff) library(ffbase) setwd("D:/My_package/Personal/R/reading") x<-cbind(rnorm(1:100000000),rnorm(1:100000000),1:100000000) system.time(write.csv2(x,"test.csv",row.names=FALSE)) system.time(x <-…
Dimon D.
  • 438
  • 5
  • 23
0
votes
1 answer

ffdf object consumes extra RAM (in GB)

I decided to test the key advantage of the ff package: minimal RAM allocation (PC specs: i5, 8 GB RAM, Win7 64-bit, RStudio). According to the package description, we can manipulate physical objects (files) like virtual ones, as if they are allocated…
Dimon D.
  • 438
  • 5
  • 23
0
votes
0 answers

R: Big data: Determine string length

My data looks like below with millions of lines. This text can be copied into a text file and read in for my example below. @HISEQ:104:C7Y3WACXX:4:1101:1307:1946 1:N:0:CGATGT NTCCGGTAGTGTAGCAGATCGGAAGAGCACACGTCTGAACTCCAGTCACC + …
mindlessgreen
  • 11,059
  • 16
  • 68
  • 113
0
votes
0 answers

How to sum up columns in a table.ff, or how to convert it to a usable form

What's the 'nature' of a table.ff object in R? The dim of a table.ff is NULL, and typically it is used for frequency measures. I could not find any function to add all columns together in order to do some statistics on the resulting 'numeric vector'. str of…
0
votes
1 answer

Best way to handle big dataset in R

I have to run some regression models and descriptives on a big dataset. I have a folder of around 500 files (update: txt files), 250 GB in total, which I would like to merge. I know how to merge all files from a folder, but although I am…
research111
  • 347
  • 5
  • 18
0
votes
1 answer

How to efficiently calculate a covariance matrix from an ff_matrix

I have a large matrix (1,000,000 rows by 1,140 columns) which I'm storing using the ff package. Is there an efficient way to calculate a covariance matrix from this? Using the cov function gives the error: Error in cov(X) : supply both 'x' and 'y'…
M. Berk
  • 189
  • 1
  • 6
0
votes
0 answers

If statement for ffdf object

I have an ffdf object x that holds a dataset with variables y and z. Due to millions of rows it needs to be stored as an ffdf object. My question is this: I want to create a new variable within this object q, that is dependent on the value of z. z…
skeletonnoire
  • 81
  • 1
  • 1
  • 3
0
votes
1 answer

Remove white space from ff_object in R

I have an ff object. One of the columns, which is a string variable, has white spaces, and I want to remove these. I have tried the following: 1). newcol <- gsub("[[:space:]]", "", mydata$mystr) 2). newcol<- as.ffdf(gsub("[[:space:]]", "",…
skeletonnoire
  • 81
  • 1
  • 1
  • 3
0
votes
1 answer

Converting "individual clock in/out time logs" to "total occupancy of building over time" efficiently

So I have data in .csv form showing the times at which specific users walk into and out of a building over a few months. I am trying to use R to tabulate the building occupancy every 15/30 minutes for analysis. The data has been cleaned and is in the…
ethane
  • 319
  • 3
  • 7
  • 13
0
votes
2 answers

How to DROP columns from an ffdf object? (R)

Can I easily drop a column from an ffdf object? library(ff);library(ffbase) irisdf=as.ffdf(iris) How do I keep only the Sepal.Length and Species columns?
Qbik
  • 5,885
  • 14
  • 62
  • 93
0
votes
1 answer

(R language) How to create empty ff data frame

Hi everyone. What I'm trying to do: create an empty ff data.frame in R. Details: I'd like to read multiple csv files in R, bind them together and create one big data.frame. Since the data are very large, I'm using the ff package. Here is my code. file_list…
dixhom
  • 2,419
  • 4
  • 20
  • 36
0
votes
1 answer

Current status of the colClasses argument in function ff::read.csv.ffdf (ff - R package)

The error vmode 'character' not implemented occurs due to the argument colClasses=c("id"="character") in the code below: df <- read.csv.ffdf('TenGBsample.csv', colClasses=c("id"="character"), VERBOSE=TRUE) read.table.ffdf 1..1000 (1000) …
Qbik
  • 5,885
  • 14
  • 62
  • 93
0
votes
1 answer

assigning value to row.names of ffdf

I have an ff data frame variable whose name is created within the code at each iteration of the loop. I want to set the row names of this variable to NULL, but the code below doesn't work. Could somebody please suggest a…
NickD1
  • 393
  • 1
  • 4
  • 14
0
votes
1 answer

How can I 'Split' my data set in R?

I've been trying for quite some time to get my test data to split. > FDF <- read.csv.ffdf(file='C:\\Users\\William\\Desktop\\R Data\\TestData0812.txt', header = FALSE, colClasses=c('factor','factor','numeric','numeric','numeric','numeric'),…
user1587280
0
votes
1 answer

How can I create a POSIXct vector in ffdf?

I've had a look around and can't quite get a grasp of what is going on with this. I'm using R in Eclipse. The file I'm trying to import is 700 MB with around 15 million rows and 6 columns. As I was having problems loading it in, I have started using the ff…
user1587280