When running a Spark job on an AWS cluster, I believe I've changed my code properly to distribute both the data and the work of the algorithm I'm using. But the output looks like this:
[Stage 3:> (0 + 2) / 1000]
[Stage 3:> (1 + 2) / 1000]
[Stage 3:> (2 + 2) / 1000]
[Stage 3:> (3 + 2) / 1000]
[Stage 3:> (4 + 2) / 1000]
[Stage 3:> (5 + 2) / 1000]
[Stage 3:> (6 + 2) / 1000]
[Stage 3:> (7 + 2) / 1000]
[Stage 3:> (8 + 2) / 1000]
[Stage 3:> (9 + 2) / 1000]
[Stage 3:> (10 + 2) / 1000]
[Stage 3:> (11 + 2) / 1000]
[Stage 3:> (12 + 2) / 1000]
[Stage 3:> (13 + 2) / 1000]
[Stage 3:> (14 + 2) / 1000]
[Stage 3:> (15 + 2) / 1000]
[Stage 3:> (16 + 2) / 1000]
Am I correct to interpret the (0 + 2) / 1000 as a single two-core processor working through the 1000 tasks, two at a time? With 5 nodes (10 cores in total), why wouldn't I see (0 + 10) / 1000?
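For reference, my understanding is that the number of concurrently running tasks is roughly executors × cores per executor. The snippet below is a minimal sketch of the kind of configuration I would expect to allow 10 tasks at once; it is not my actual submission code, the values are illustrative, and I'm assuming these settings take effect when set at session creation (they may need to be passed at submission time depending on the deployment):

from pyspark.sql import SparkSession

# Illustrative values: 5 executors x 2 cores each should allow
# up to 10 tasks to run at the same time, assuming the cluster
# manager actually grants these resources.
spark = (
    SparkSession.builder
    .appName("parallelism-check")
    .config("spark.executor.instances", "5")  # one executor per node (assumption)
    .config("spark.executor.cores", "2")      # cores per executor (assumption)
    .getOrCreate()
)

sc = spark.sparkContext
# defaultParallelism reflects the total cores Spark thinks it has.
# If this prints 2 rather than 10, only one two-core executor was granted,
# which would match the (n + 2) / 1000 progress I'm seeing.
print(sc.defaultParallelism)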