
I'm using MLlib to train a random forest. It works fine up to depth 15, but at depth 20 I get

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

on the driver, from the collectAsMap operation in DecisionTree.scala, around line 642. It doesn't happen until a good hour into training. I'm using 50 trees on 36 slaves with maxMemoryInMB=250, but I still get the error even with 240G of driver memory.

Has anybody seen this error in this context before, and can anyone advise on what might be triggering it?

Best, Luke

zero323
  • Perhaps using a 64-bit JVM and `-XX:+UseCompressedOops` solves the issue. See https://stackoverflow.com/a/5497506/942774 – Hendrik May 25 '17 at 08:37
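
A note on the message itself: on HotSpot JVMs, "Requested array size exceeds VM limit" is generally raised when a single array is requested with more elements than the JVM allows (a little under Integer.MAX_VALUE), which is a separate condition from running out of heap. That would be consistent with a larger driver memory making no difference. The snippet below is only a minimal sketch of that behaviour; the class and method names are hypothetical, it has nothing to do with MLlib's internals, and the exact message can vary by JVM vendor and version.

```java
class ArrayLimitDemo {
    // Hypothetical helper: tries to allocate a byte array of the given length.
    // Returns null on success, or the OutOfMemoryError message on failure.
    static String tryAllocate(int length) {
        try {
            byte[] data = new byte[length];
            return data.length == length ? null : "unexpected length";
        } catch (OutOfMemoryError e) {
            // Never return null on failure, even if the JVM supplies no message.
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // A small request succeeds on any reasonable heap.
        System.out.println("small array: " + tryAllocate(16));
        // A request at Integer.MAX_VALUE elements is expected to fail;
        // on HotSpot the failure is typically the array-size limit,
        // regardless of how large -Xmx is.
        System.out.println("huge array:  " + tryAllocate(Integer.MAX_VALUE));
    }
}
```

If the second line reports the array-size limit even with a very large -Xmx, the limit being hit is the per-array length cap rather than total memory, so whatever is collected on the driver would need to shrink (for example, a lower depth or fewer nodes per collected batch) rather than the heap grow.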

0 Answers