How does Spark's reduce work for this example?
val num = sc.parallelize(List(1,2,3))
val result = num.reduce((x, y) => x + y)
result: Int = 6
val result = num.reduce((x, y) => x + (y * 10))
result: Int = 321
I understand the first result (1 + 2 + 3 = 6). For the second, I expected 60, not 321. Can someone explain? This is how I thought it would be evaluated:
Step 1: 0 + (1 * 10) = 10
Step 2: 10 + (2 * 10) = 30
Step 3: 30 + (3 * 10) = 60
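Expressed in plain Scala (no Spark), the evaluation I had in mind is a left fold with an initial value of 0. This is only a sketch of my assumption, not of how Spark's reduce is defined; the name `f` is just for illustration:

```scala
// The function passed to reduce in the second example.
val f = (x: Int, y: Int) => x + (y * 10)

// What I expected: start from 0 and apply f strictly left to right.
// 0 + 10 = 10, then 10 + 20 = 30, then 30 + 30 = 60
val expected = List(1, 2, 3).foldLeft(0)(f)

println(expected) // 60
```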
Update: According to the Spark documentation for reduce:
The function should be commutative and associative so that it can be computed correctly in parallel.
https://spark.apache.org/docs/latest/rdd-programming-guide.html
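A minimal sketch of why this matters, using plain Scala collections rather than Spark. It assumes the list was split into two partitions, [1] and [2, 3]; that split is my guess and is not something the output confirms. The names `f`, `partitions`, `partials`, `combined` and `sequential` are only for illustration:

```scala
// The non-commutative, non-associative function from the second example.
val f = (x: Int, y: Int) => x + (y * 10)

// Hypothetical split of List(1, 2, 3) into two partitions (an assumption).
val partitions = List(List(1), List(2, 3))

// Each partition is reduced on its own first:
// [1] stays 1, and [2, 3] becomes 2 + (3 * 10) = 32.
val partials = partitions.map(_.reduce(f))

// Then the partial results are combined: 1 + (32 * 10) = 321.
val combined = partials.reduce(f)

// A strictly sequential left-to-right reduce gives a different value:
// (1 + 20) + 30 = 51. There is no initial 0, so it is not 60 either.
val sequential = List(1, 2, 3).reduce(f)

println(s"partitioned = $combined, sequential = $sequential")
```

With a function such as (x, y) => x + y, every grouping and ordering yields the same value, which is why the first example always returns 6.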