
Let's assume we have 10 nodes, each of which has 2 cores. We set our defaultParallelism to 2 * 10 = 20, hoping that each node will be assigned exactly 2 partitions when we call `sc.parallelize(1 to 20)`. In some cases this assumption turns out to be wrong: depending on conditions we cannot pin down, Spark sometimes places more than 2 partitions onto a single node, sometimes skipping one or more nodes altogether. This causes serious skew, and repartitioning doesn't help, as we have no control over the placement of partitions onto physical nodes.
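For reference, here is a minimal sketch of the setup described above (the app name and variable names are illustrative, not the actual solver code), together with a way to print which host each partition ends up on:

```scala
import java.net.InetAddress
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative setup: 10 nodes x 2 cores, default parallelism 2 * 10 = 20.
val conf = new SparkConf()
  .setAppName("partition-placement-check") // hypothetical app name
  .set("spark.default.parallelism", "20")
val sc = new SparkContext(conf)

// sc.parallelize with no explicit numSlices uses defaultParallelism,
// so this creates 20 partitions. Record the host each partition runs on.
val placement = sc.parallelize(1 to 20)
  .mapPartitionsWithIndex { (idx, it) =>
    Iterator((idx, InetAddress.getLocalHost.getHostName, it.size))
  }
  .collect()

// Group by host to see how many partitions each node actually received.
placement.groupBy(_._2).foreach { case (host, parts) =>
  println(s"$host -> partitions ${parts.map(_._1).mkString(", ")}")
}
```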

  • Why can this happen?
  • How to make sure each node gets assigned exactly 2 partitions?

Also, `spark.locality.wait` is set to `999999999s`, for what it's worth.
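This is roughly how that setting is applied (a sketch; the same key can also go into spark-defaults.conf or be passed via --conf to spark-submit):

```scala
import org.apache.spark.SparkConf

// Locality wait as described above: effectively never fall back
// to a less local scheduling level.
val conf = new SparkConf()
  .set("spark.locality.wait", "999999999s")
// The per-level settings spark.locality.wait.process / .node / .rack
// default to spark.locality.wait, so they inherit the same value here.
```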

The DAG in which this happens is shown below. While the parallelize in stage 0 assigns partitions evenly, the parallelize in stage 1 does not. This happens every time. Why?

[DAG image]
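As a further diagnostic (not part of the original code), the locality preferences Spark records for the partitions of a parallelized collection can be printed from the driver:

```scala
// Print the preferred locations Spark tracks for each partition. For a plain
// sc.parallelize with no explicit location hints this list is typically empty,
// which suggests placement is driven by whichever executor slots are free when
// the tasks launch rather than by data locality.
val rdd = sc.parallelize(1 to 20)
rdd.partitions.foreach { p =>
  println(s"partition ${p.index}: preferred locations = ${rdd.preferredLocations(p)}")
}
```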

Linking a related question.

asked by kboom
  • I think you have misunderstood the concept of `spark.default.parallelism`. First of all, it's only applicable to RDDs (not DataFrames). Second, it only shuffles data across the cluster after the first Action is invoked. Regarding your question: it will not divide your partitions across nodes evenly. It will just make sure that the manipulated output data is partitioned into X parts (as configured). These partitions lie across the cluster. The only correlation between cores and partitions is when you perform a task (executing some computation --> Stage). – Nir Hedvat May 26 '19 at 10:19
  • I am using only RDDs. I updated the question with the DAG. I am checking whether Spark can be used to solve a large system of linear equations (this is a custom method and I cannot use a regular machine learning library). It gives correct results, but starting from a certain problem size the computations fail because of that skew. Is there absolutely no chance of getting rid of it? Why would Spark do it this way? – kboom May 26 '19 at 13:52
  • According to your DAG, it appears that you are writing parallelized data (saving it, per the `partitionBy` in Stage 2). This forces Spark to use X partitions according to the `defaultParallelism` value (it's an Action...). This is why one is even and the other is not. Can you please publish your code? – Nir Hedvat May 26 '19 at 14:52
  • Here: https://github.com/kboom/iga-adi-graphx/blob/46f099599c3f90d9dff7ecb5e4a98f78ec17a356/src/main/scala/edu/agh/kboom/iga/adi/graph/solver/DirectionSolver.scala#L39. Why is one even and the other not? Also, in both cases the number of partitions is equal to the one from defaultParallelism; only the assignment to the nodes is skewed. – kboom May 26 '19 at 15:04

0 Answers