In SparkR I have a DataFrame data that also contains a column id. I also have a vector liste = 2 9 12 102 154 ... 1451, where length(liste) = 3001. I want the rows of data whose id is contained in liste. In SparkR I do this:
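# Start with the rows matching the first two ids
newdata <- unionAll(filter(data, data$id == liste[1]), filter(data, data$id == liste[2]))
# Append the rows matching each further id, one union per id
for (j in 3:10) {
  newdata <- unionAll(newdata, filter(data, data$id == liste[j]))
}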
These 10 iterations alone take a long time, about 5 minutes. When I try to run all 3001 iterations, SparkR says "error returnstatus==0 is not true". How should one solve this?
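One direction that might avoid the long chain of unions is a single filter with a membership test, since SparkR defines %in% for columns. This is only a minimal sketch: it assumes data and liste exist exactly as described above and that a SparkR session is already running, and it has not been run against the real data.

```r
library(SparkR)

# Assumption: `data` is the SparkR DataFrame and `liste` is the local
# vector of 3001 ids described above; a SparkR session is already running.

# One filter with a membership test replaces the 3001 separate
# filter + unionAll steps, so Spark builds a single small query plan.
newdata <- filter(data, data$id %in% liste)
```

If the id vector grew much larger, joining data against a small DataFrame built from liste might be worth trying instead, but that is a guess rather than something measured here.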