
Let's say I am trying to generate a large empty matrix of zeros that I can then fill with data (e.g. count data), using the ff package (the ffdf() function is provided by ff itself):

require(ff)

If there are 15,000 columns (variables) and 20 rows (observations), I could do the following

ffdf.object = ffdf( ff(0, dim = c(20, 15000)) )

I thought the point of ff was to load much larger datasets. For example:

> test = matrix(0, nrow = 1000000, ncol = 15000)
Error: cannot allocate vector of size 111.8 Gb

but ff runs into roughly the same problem: the total number of elements in an ff matrix cannot exceed .Machine$integer.max.

> test = ff(0, dim = c(1000000, ncol = 15000))
Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In ff(0, dim = c(1e+06, ncol = 15000)) :
  NAs introduced by coercion to integer range
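For reference, the request simply asks for more elements than a single ff object can index. A quick check in base R (nothing package-specific assumed):

```r
# An ff matrix is one object, so its total element count must fit
# in a signed 32-bit integer index.
total <- 1e6 * 15000           # 1.5e10 elements requested
limit <- .Machine$integer.max  # 2147483647 (~2.1e9)
total > limit                  # TRUE: the request overflows the index
```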

Is there an easy way to create a large (e.g. 1M by 15k) ffdf in R? Alternatively, is there an easy way to make the largest possible ffdf and then rbind additional rows onto it (with working code; both rbind and ffdfappend have failed for me so far)?
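For context, one thing I would expect to stay under the per-object limit is building the ffdf column by column, since each column is then its own ff vector of length 1e6 (well under .Machine$integer.max). A sketch, with the column count kept small purely for illustration:

```r
require(ff)

# Each column is a separate ff vector, so no single ff object
# exceeds .Machine$integer.max elements.
nrows <- 1e6
ncols <- 5   # illustration only; the real data has 15,000 columns
cols <- lapply(seq_len(ncols), function(i) ff(0, length = nrows))
names(cols) <- paste0("V", seq_len(ncols))
big <- do.call(ffdf, cols)
dim(big)   # 1000000 rows, 5 columns
```

Is scaling this up to 15,000 columns a sensible approach, or does it fall over at that width?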

Brian Jackson

1 Answer


You could stage the data in an SQL database instead and fill it incrementally. Check out the RSQLite package.
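A minimal sketch of that approach, assuming RSQLite is installed (the database path, table name, and chunk dimensions are made up for illustration):

```r
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), "counts.db")

# Write the data in row chunks instead of allocating the full
# matrix in memory; each chunk is an ordinary data.frame.
chunk <- as.data.frame(matrix(0, nrow = 1000, ncol = 100))
dbWriteTable(con, "counts", chunk)                 # first chunk creates the table
dbWriteTable(con, "counts", chunk, append = TRUE)  # later chunks append rows

dbGetQuery(con, "SELECT COUNT(*) AS n FROM counts")  # row count so far
dbDisconnect(con)
```

One caveat: SQLite caps the number of columns per table (2,000 by default), so a 15,000-column layout would need to be stored in long format (row, column, value) rather than wide.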

thc