Let's assume we have 10 nodes, each with 2 cores. We set `defaultParallelism` to 2 * 10 = 20, hoping that each node will be assigned exactly 2 partitions when we call `sc.parallelize(1 to 20)`. This assumption turns out to be incorrect in some cases: under conditions we haven't been able to pin down, Spark sometimes places more than 2 partitions on a single node and sometimes skips one or more nodes altogether. This causes serious skew, and repartitioning doesn't help, since we have no control over which physical nodes the partitions are placed on.
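For reference, this is roughly what we run, plus the diagnostic I use to see which host each partition lands on (the `mapPartitionsWithIndex` hostname check and the app name are my own additions for debugging, not part of the real job):

```scala
import java.net.InetAddress
import org.apache.spark.{SparkConf, SparkContext}

object PartitionPlacement {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("partition-placement")          // illustrative name
      .set("spark.default.parallelism", "20")     // 10 nodes * 2 cores
    val sc = new SparkContext(conf)

    // 20 elements and defaultParallelism = 20, so one element per partition.
    val rdd = sc.parallelize(1 to 20)

    // Record which executor host processes each partition index.
    val placement = rdd
      .mapPartitionsWithIndex { (idx, it) =>
        Iterator((InetAddress.getLocalHost.getHostName, idx))
      }
      .collect()

    // Ideally every host appears exactly twice; in practice some hosts
    // get 3 or more partitions and some get none.
    placement.groupBy(_._1).foreach { case (host, parts) =>
      println(s"$host -> ${parts.map(_._2).sorted.mkString(", ")}")
    }

    sc.stop()
  }
}
```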
- Why can this happen?
- How can we make sure each node is assigned exactly 2 partitions?
Also, `spark.locality.wait` is set to `999999999s`, for what it's worth.
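For completeness, this is how that setting is applied in the sketch above (this assumes it is set on the `SparkConf`; it could just as well come from `spark-defaults.conf` or `--conf`):

```scala
import org.apache.spark.SparkConf

// Same SparkConf as in the sketch above, with the locality timeout pushed out so the
// scheduler should not downgrade a task to a less-local node just because the wait expired.
val conf = new SparkConf()
  .setAppName("partition-placement")
  .set("spark.default.parallelism", "20")
  .set("spark.locality.wait", "999999999s")
```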
The DAG in which this happens is given below. While the `parallelize` in stage 0 assigns partitions evenly, the `parallelize` in stage 1 does not. It is always like this; why?
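To make the stage structure concrete, a job of roughly the following shape produces that kind of DAG. The `keyBy`/`join` here is only an illustrative stand-in for the real transformations; the point is just the shape: stage 0 and stage 1 are each built from a `parallelize` feeding a shuffle, and it is the one in stage 1 whose partitions end up unevenly placed.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TwoStageSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("two-stage-sketch").set("spark.default.parallelism", "20"))

    // Stage 0: first parallelize, keyed so the join below introduces a shuffle boundary.
    val left = sc.parallelize(1 to 20).keyBy(x => x)

    // Stage 1: second parallelize; in our job this is the one whose partitions
    // end up unevenly spread over the physical nodes.
    val right = sc.parallelize(1 to 20).keyBy(x => x)

    // Stage 2: the join consumes both shuffle outputs.
    left.join(right).count()

    sc.stop()
  }
}
```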
Linking a related question.