
When I train a model with ALS in Spark's MLlib (1.4) on Windows, PySpark always terminates with a StackOverflowError. I tried adding checkpointing as described in https://stackoverflow.com/a/31484461/36130, but it doesn't seem to help: a new directory gets created on every run, yet it is always empty.

Here's the training code and stack trace:

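import itertools
from pyspark.mllib.recommendation import ALS

# training, validation, test, their counts, and computeRmse are defined elsewhere (not shown)
ranks = [8, 12]
lambdas = [0.1, 10.0]
numIters = [10, 20]
bestModel = None
bestValidationRmse = float("inf")
bestRank = 0
bestLambda = -1.0
bestNumIter = -1

# Grid search over every (rank, lambda, iterations) combination
for rank, lmbda, numIter in itertools.product(ranks, lambdas, numIters):
    # Attempt to enable checkpointing, following the linked answer
    ALS.checkpointInterval = 2
    model = ALS.train(training, rank, numIter, lmbda)
    validationRmse = computeRmse(model, validation, numValidation)

    # Keep the model with the lowest validation RMSE seen so far
    if validationRmse < bestValidationRmse:
        bestModel = model
        bestValidationRmse = validationRmse
        bestRank = rank
        bestLambda = lmbda
        bestNumIter = numIter

testRmse = computeRmse(bestModel, test, numTest)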

Stack trace:

15/08/27 02:02:58 ERROR Executor: Exception in task 3.0 in stage 56.0 (TID 127)
java.lang.StackOverflowError
    at java.io.ObjectInputStream$BlockDataInputStream.readInt(Unknown Source)
    at java.io.ObjectInputStream.readHandle(Unknown Source)
    at java.io.ObjectInputStream.readClassDesc(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)

1 Answer


Try setting the checkpoint directory on the SparkContext before training:

sc.setCheckpointDir("/check_point_dir")
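A rough sketch of how that call could be wired in for a local run like the one in the question. The helper names `make_checkpoint_dir` and `enable_checkpointing` and the directory name `spark_als_checkpoints` are hypothetical; only `SparkContext.setCheckpointDir` is the actual PySpark call. It assumes `sc` is an existing SparkContext and that the directory only needs to be writable on the local machine. Whether this alone resolves the StackOverflowError is not confirmed here.

```python
import os
import tempfile


def make_checkpoint_dir(base=None):
    """Create a writable local directory and return its absolute path.

    The directory name is an arbitrary, hypothetical choice.
    """
    if base is None:
        base = tempfile.gettempdir()
    path = os.path.abspath(os.path.join(base, "spark_als_checkpoints"))
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


def enable_checkpointing(sc, path):
    """Point the given SparkContext at the checkpoint directory.

    Call this once, before the training loop, so that every
    ALS.train call runs with a checkpoint directory already set.
    """
    sc.setCheckpointDir(path)
    return path
```

With a live context this would be called as `enable_checkpointing(sc, make_checkpoint_dir())` ahead of the `itertools.product` loop. On a real cluster the checkpoint directory would normally need to be on shared storage such as HDFS rather than a local temp folder.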