When I run Spark applications, I see in the web UI that some stage descriptions look like "apply at Option.scala:120". Why does Spark split a stage at a line that is not in my Spark program but in a Scala library?

user3733525
1 Answer
These names are generated in Utils.getCallSite() (GitHub link). Basically it is the method name ("apply" here) on the last Spark line in the stack trace, plus the file name and line number ("Option.scala:120") on the first non-Spark line in the stack trace.
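
To illustrate the idea, here is a simplified sketch of that naming logic. It is not Spark's actual implementation (that lives in org.apache.spark.util.Utils.getCallSite); the class-name check and the helper names below are my own assumptions for illustration:

    // Simplified sketch of the call-site naming described above.
    object CallSiteSketch {
      // Frames from these packages count as "inside Spark" (the scala library
      // is also treated as internal in some Spark versions).
      private def isInternalFrame(className: String): Boolean =
        className.startsWith("org.apache.spark.")

      // Spark calls its real version of this from inside an action (e.g. count),
      // so Spark-internal frames sit at the top of the stack trace.
      def shortCallSite(): String = {
        val trace = Thread.currentThread.getStackTrace
          .filterNot(el => el.getMethodName.contains("getStackTrace"))
        var lastSparkMethod = "<unknown>" // method name of the last internal frame, e.g. "apply"
        var firstUserSite = "<unknown>:0" // file:line of the first external frame, e.g. "Option.scala:120"
        var insideSpark = true
        for (el <- trace) {
          if (insideSpark) {
            if (isInternalFrame(el.getClassName)) {
              lastSparkMethod = el.getMethodName
            } else {
              firstUserSite = s"${el.getFileName}:${el.getLineNumber}"
              insideSpark = false
            }
          }
        }
        s"$lastSparkMethod at $firstUserSite" // e.g. "apply at Option.scala:120"
      }
    }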
So it looks like you make an Option.getOrElse() call, and the default value you provide is what starts the stage.
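
For example, such a call site can arise when the expression that actually triggers the job sits in the by-name default of getOrElse, so scala.Option.getOrElse (Option.scala) is on the stack when the job is submitted. A minimal, hypothetical sketch (the names, master setting, and numbers are made up, and the exact description shown in the UI depends on the Spark and Scala versions):

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical scenario: the RDD action runs inside the by-name default
    // value of Option.getOrElse, so scala.Option.getOrElse is on the stack
    // trace at the moment the job is submitted.
    object GetOrElseStageDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("getOrElse-demo").setMaster("local[*]"))

        val cachedCount: Option[Long] = None // pretend no cached value is available
        // If the Option is empty, the default expression submits a Spark job.
        val count = cachedCount.getOrElse(sc.parallelize(1 to 1000).count())
        println(s"count = $count")

        sc.stop()
      }
    }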
In Spark 1.1 you can get the full stack trace for each stage, which takes the guesswork out of this.

Daniel Darabos