I wish to build a matrix with 10^7 columns and 2500 rows. Since this is too large for my computer, I thought I could create the matrix iteratively. I would like to use the bigsparser package for storing the matrix on disk.
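For scale, here is a rough back-of-envelope estimate of the in-memory size. It assumes the roughly 2% density used in my example code and the standard dgCMatrix layout (integer row indices, double values, integer column pointers). The variable names are illustrative only.

```r
# Rough size of the full matrix as an in-memory dgCMatrix.
n_rows  <- 2500
n_cols  <- 1e7
density <- 0.02   # assumed share of non-zero entries (before de-duplication)

n_nonzero_total <- density * n_rows * n_cols

# dgCMatrix layout: a 4-byte integer row index plus an 8-byte double value
# per non-zero entry, plus one 4-byte column pointer per column (+1).
bytes <- n_nonzero_total * (4 + 8) + (n_cols + 1) * 4

approx_gb <- bytes / 1e9
print(round(approx_gb, 2))
```

Under these assumptions the full matrix would need about 6 GB of RAM before any temporary copies, which is why I want to build it in chunks on disk.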
Here is how I create the first matrix:
library(bigsparser)
library(data.table)
library(Matrix)
nvars <- 10000000 # columns
ncons <- 10 # rows per chunk
n_nonzero <- round(0.02*nvars*ncons) # approximate; duplicates are dropped below, so there may be fewer values
set.seed(13)
# the first table
Amat <- data.frame(
i=sample.int(ncons, n_nonzero, replace=TRUE),
j=sample.int(nvars, n_nonzero, replace=TRUE),
x=runif(n_nonzero)
)
setDT(Amat)
Amat <- unique(Amat, by=c("i", "j"))
AmatSparse <- sparseMatrix(
i=Amat[,get("i")], j=Amat[,get("j")], x=Amat[,get("x")],
dims=c(2500, nvars)
)
AmatSFBM <- as_SFBM(AmatSparse, backingfile="sparsemat", compact = FALSE)
As you can see, I know the dimensions of the final matrix beforehand and have set them accordingly.
Now I want to add some rows, like this:
for (iter in 2:250) {
Amat <- data.frame(
i=sample.int(ncons, n_nonzero, replace=TRUE),
j=sample.int(nvars, n_nonzero, replace=TRUE),
x=runif(n_nonzero)
)
setDT(Amat)
Amat <- unique(Amat, by=c("i", "j"))
Amat[,i:=i+(iter-1)*ncons] # shift row indices to this chunk's block so that 250 chunks fill 2500 rows
# this does not work:
AmatSFBM[Amat[,get("i")], Amat[,get("j")]] <- Amat[,get("x")]
}
However, the [<- operator does not seem to work for SFBM objects.
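To make the intended update concrete, here is a minimal in-memory sketch on a tiny ordinary Matrix sparse matrix (not an SFBM). It assumes the goal is to set individual (i, j) pairs, which in R requires a two-column index matrix rather than two separate index vectors. The names M, i, j and x are illustrative only.

```r
library(Matrix)

# Small stand-in for the real matrix: 4 rows, 6 columns, all zero.
M <- Matrix(0, nrow = 4, ncol = 6, sparse = TRUE)

# Three (i, j, x) triplets, analogous to the columns of Amat.
i <- c(1L, 3L, 4L)
j <- c(2L, 5L, 6L)
x <- c(0.5, 1.5, 2.5)

# A two-column index matrix addresses the element pairs (i[k], j[k]).
# M[i, j] <- x would instead address the whole 3 x 3 submatrix.
M[cbind(i, j)] <- x

print(M)
```

This is the behaviour I would like to reproduce for the on-disk object, one chunk of rows at a time.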
Is there any way to build an SFBM object other than with as_SFBM from a sparse matrix? For example,
- can I add two SFBM objects of the same dimensions?
- can I create an SFBM object from a CSV file or similar?
Either would be fine.