
I'm interested in using apply on an ffdf. I don't think I can use ffdfdply, because I don't want to split by columns: I need all four components of each row ("identifier1", "identifier2", "value" and "condition") to fill two matrices depending on a condition. As this earlier thread suggests (How to use apply or sapply or lapply with ffdf?), I've tried this:

apply(physical(myffdf), 1, function(x) {
  # Fill one of the two matrices depending on the row's condition
  if (x["condition"] == "A") {
    matrix1[x["identifier1"], x["identifier2"]] <<- x["value"]
  } else if (x["condition"] == "B") {
    matrix2[x["identifier1"], x["identifier2"]] <<- x["value"]
  }
})

but I've read that physical() returns a list of atomic ff objects, so logically I can't use apply on it. Any suggestions?

mfalco

1 Answer


It seems you want to apply a function over the rows. You can use chunk for that: load one chunk at a time into RAM, use apply on it, and store the result wherever you want (in RAM or in ff).

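library(ff)
ffiris <- as.ffdf(iris)
for (i in chunk(ffiris)) {
  x <- ffiris[i, ]  # this chunk is now an ordinary data.frame in RAM
  apply(x, MARGIN = 1, FUN = yourfunction)  # yourfunction is a placeholder for your own row function
}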
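For the two-matrix case in the question, a minimal sketch of the same chunked idea without apply or <<- might look like the following. The helper name fill_matrices and the toy data are hypothetical, and the chunks are emulated here with a plain data.frame so the example runs without ff installed; with a real ffdf you would loop over chunk(myffdf) and pass myffdf[i, ] instead. It assumes both matrices carry row and column names matching the identifiers.

```r
# Hypothetical helper (not part of ff): fill two named matrices from one in-RAM chunk.
fill_matrices <- function(chunk_df, matrix1, matrix2) {
  # ffdf stores strings as factors, so convert before indexing by name.
  id1  <- as.character(chunk_df$identifier1)
  id2  <- as.character(chunk_df$identifier2)
  cond <- as.character(chunk_df$condition)
  a <- which(cond == "A")
  b <- which(cond == "B")
  # A two-column character matrix indexes a matrix by (row name, column name).
  # An unknown name here raises "subscript out of bounds".
  if (length(a) > 0) matrix1[cbind(id1[a], id2[a])] <- chunk_df$value[a]
  if (length(b) > 0) matrix2[cbind(id1[b], id2[b])] <- chunk_df$value[b]
  list(matrix1 = matrix1, matrix2 = matrix2)
}

# Toy data, assumed for illustration only.
matrix1 <- matrix(NA_real_, nrow = 2, ncol = 3,
                  dimnames = list(c("g1", "g2"), c("s1", "s2", "s3")))
matrix2 <- matrix1
rows <- data.frame(
  identifier1 = factor(c("g1", "g2", "g1", "g2")),
  identifier2 = factor(c("s1", "s3", "s2", "s1")),
  value       = c(1.5, 2.5, 3.5, 4.5),
  condition   = factor(c("A", "B", "A", "B"))
)

# Emulate chunks of two rows each; with ff this would be
#   for (i in chunk(myffdf)) { x <- myffdf[i, ]; ... }
for (x in split(rows, rep(1:2, each = 2))) {
  res <- fill_matrices(x, matrix1, matrix2)
  matrix1 <- res$matrix1
  matrix2 <- res$matrix2
}
```

Returning the updated matrices and reassigning them in the loop keeps the helper free of side effects. Working on whole columns also avoids a pitfall of apply on a data.frame: it converts the chunk with as.matrix first, so rows mixing text and numbers reach the function as character vectors.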
  • Doing this: 'for(i in chunk(myffdf)){ x <- myffdf[i, ] apply(x,1,FUN=function(x){ if( x["condition"]=="A"){ matrix1[x["indentifier1"],x["identifier2"]] <<- x["value"] } ifelse( x["condition"]=="B") { matrix1[x["indentifier1"],x["identifier2"]] <<- x["value"] } }) }' produces the following error: Error in `[<-`(`*tmp*`, x["identifier1"], x["identifier2"], value = "SERINC1") : subscript out of bounds. But I've checked that "SERINC1" (which is an identifier2) exists in both matrix1 and matrix2. – mfalco Apr 10 '15 at 14:29
  • Looks like your FUN contains bugs. Fix it. Also I advise you not to use <<-. It is bad practice to use that inside a function unless you know what you're doing. –  Apr 10 '15 at 14:44
  • I've fixed the bugs and now it's running, thanks. BTW, do you know any good tutorial for the ff package or ffdf objects? I've been struggling with the CRAN PDF :S – mfalco Apr 10 '15 at 16:37