I have a large data.frame with 20M rows. It is not purely numeric; it also contains character columns. Following a divide-and-conquer approach, I want to split this data frame and process the parts in parallel using the snow package (specifically the parLapply function). The problem is that the nodes run out of memory, because the data frame parts are held in RAM. I looked for a package to help with this and found only one that handles a mixed-type data.frame: ff. Using it raises another problem, though: the result of splitting an ffdf is not the same as the result of splitting an ordinary data.frame, so it is not possible to run parLapply on it.
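A minimal sketch of the in-memory pattern described above might look like the following. It is an illustration only: the toy data frame, its column names (`id`, `group`, `value`) and the `summarise_chunk` function are hypothetical stand-ins, and it uses the parallel package (which ships with R and provides the same `makeCluster`/`parLapply` interface as snow) so that it runs without extra installs.

```r
library(parallel)  # ships with R; provides snow-style makeCluster() and parLapply()

# Toy stand-in for the real 20M-row data frame (column names are hypothetical)
df <- data.frame(
  id    = 1:12,
  group = rep(c("a", "b", "c"), each = 4),
  value = as.numeric(1:12),
  stringsAsFactors = FALSE
)

# split() on an ordinary data.frame returns a named list of data.frames,
# all of which live in the master's RAM
chunks <- split(df, df$group)

# Hypothetical per-chunk work: one summary row per chunk
summarise_chunk <- function(chunk) {
  data.frame(
    group = chunk$group[1],
    total = sum(chunk$value),
    stringsAsFactors = FALSE
  )
}

cl <- makeCluster(2)                                # two local worker processes
results <- parLapply(cl, chunks, summarise_chunk)   # chunks are serialized to the workers
stopCluster(cl)

# Combine the per-chunk results back into one data.frame
combined <- do.call(rbind, results)
print(combined)
```

With a small data frame this works as expected. With 20M mixed-type rows, both the list of chunks on the master and the copies sent to each worker have to fit in memory, which is where this approach breaks down.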
Do you know of any other packages suited to this? bigmemory only supports matrices.