When I run Spark applications, the web UI shows some stage descriptions like “apply at Option.scala:120”. Why does Spark split a stage at a line that is not in my Spark program but in a Scala library?

user3733525

1 Answer

These lines are generated in Utils.getCallStack() (GitHub link). Basically it's the method name (apply here) on the last Spark line in the stack trace plus the file name and line number (Option.scala:120) on the first non-Spark line in the stack trace.
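The rule described above can be sketched in a few lines of plain Scala. This is only an illustration, not Spark's actual code: `CallSiteSketch`, `shortForm`, the `org.apache.spark.` prefix test, and every frame in the sample stack (including the hypothetical `Hypothetical` class and all line numbers other than 120) are assumptions made up for the example.

```scala
object CallSiteSketch {
  // Assumption: a frame counts as a "Spark" frame when its class name has this prefix.
  private val SparkPrefix = "org.apache.spark."

  private def isSpark(f: StackTraceElement): Boolean =
    f.getClassName.startsWith(SparkPrefix)

  /** Builds a label such as "apply at Option.scala:120" from a stack given innermost frame first. */
  def shortForm(stack: Seq[StackTraceElement]): String = {
    // Skip anything above the first Spark frame, then split off the leading run of Spark frames.
    val (sparkFrames, rest) = stack.dropWhile(f => !isSpark(f)).span(isSpark)
    // Method name comes from the last (outermost) Spark frame of that run.
    val method = sparkFrames.lastOption.map(_.getMethodName).getOrElse("<unknown>")
    // File and line come from the first non-Spark frame after it.
    rest.headOption match {
      case Some(f) => s"$method at ${f.getFileName}:${f.getLineNumber}"
      case None    => s"$method at <unknown>"
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up stack, innermost frame first; only the shape matters.
    val stack = Seq(
      new StackTraceElement("org.apache.spark.SparkContext", "runJob", "SparkContext.scala", 1),
      new StackTraceElement("org.apache.spark.rdd.RDD", "count", "RDD.scala", 2),
      new StackTraceElement("org.apache.spark.examples.Hypothetical$$anonfun$1", "apply", "Hypothetical.scala", 3),
      new StackTraceElement("scala.Option", "getOrElse", "Option.scala", 120),
      new StackTraceElement("org.apache.spark.examples.Hypothetical$", "main", "Hypothetical.scala", 4)
    )
    println(shortForm(stack)) // prints "apply at Option.scala:120"
  }
}
```

In this made-up stack the closure class lives under `org.apache.spark`, so the last Spark frame is its `apply` method and the first non-Spark frame is the one in `Option.scala`.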

So it looks like you make an Option.getOrElse() call, and the default value you provide is what starts the stage.
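To see why the default value ends up underneath a frame from `Option.scala`, note that `getOrElse` takes its default by name, so the default expression is evaluated inside `getOrElse` itself. The sketch below shows this without Spark by capturing the stack trace from within the default; `GetOrElseFrames` and `stackInsideDefault` are hypothetical names, and the reported line number depends on the Scala version in use.

```scala
object GetOrElseFrames {
  // Captures the current stack from inside the by-name default passed to getOrElse.
  def stackInsideDefault(): Seq[StackTraceElement] = {
    val empty: Option[Seq[StackTraceElement]] = None
    // The default is only evaluated because the Option is empty,
    // and it runs while Option.getOrElse is still on the stack.
    empty.getOrElse(Thread.currentThread().getStackTrace.toSeq)
  }

  def main(args: Array[String]): Unit = {
    // Print only the frames that come from scala.Option.
    stackInsideDefault()
      .filter(_.getClassName == "scala.Option")
      .foreach(f => println(s"${f.getMethodName} at ${f.getFileName}:${f.getLineNumber}"))
  }
}
```

If the default expression were a Spark action instead of a stack capture, the job would be launched from beneath that same `getOrElse` frame.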

In Spark 1.1 you can get the full stack trace for each stage, which takes the guesswork out of this.

Daniel Darabos