I have to compute and keep in memory several (e.g. 20 or more) random forest models with Apache Spark.
I only have 8 GB available on the driver of the YARN cluster I use to launch the job, and I am facing OutOfMemory
errors because the models do not fit in memory. I have already decreased the ratio spark.storage.memoryFraction
to 0.1 to try to increase the non-RDD memory.
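
This is roughly how I set the configuration (the application name is a placeholder; the relevant part is the memoryFraction setting):

    import org.apache.spark.{SparkConf, SparkContext}

    // Lowering spark.storage.memoryFraction shrinks the RDD cache so that
    // more of the heap is left for everything else (including the models).
    val conf = new SparkConf()
      .setAppName("train-many-random-forests")
      .set("spark.storage.memoryFraction", "0.1")

    val sc = new SparkContext(conf)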
I thus have two questions:

- How could I make these models fit in memory?
- How could I check the size of my models?
EDIT
I have 200 executors, each with 8 GB of memory.
I am not sure whether the models live on the driver, but I suspect they do, since I get OutOfMemory
errors while there is plenty of free space on the executors. Furthermore, I store these models in Arrays, as in the sketch below.
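
To make the setup concrete, here is a simplified sketch of the training loop (the datasets and hyper-parameters are placeholders for what I actually use); all the trained models end up in an Array on the driver:

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.tree.RandomForest
    import org.apache.spark.mllib.tree.model.RandomForestModel
    import org.apache.spark.rdd.RDD

    // Train one random forest per training set and collect all of the
    // resulting models into a driver-side Array.
    def trainAll(trainingSets: Seq[RDD[LabeledPoint]]): Array[RandomForestModel] =
      trainingSets.map { data =>
        RandomForest.trainClassifier(
          data,
          2,               // numClasses
          Map[Int, Int](), // categoricalFeaturesInfo (no categorical features)
          100,             // numTrees
          "auto",          // featureSubsetStrategy
          "gini",          // impurity
          10,              // maxDepth
          32)              // maxBins
      }.toArray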