
As I understand it, Spark's AQE (Adaptive Query Execution) adjusts a DataFrame's partitioning dynamically at runtime (when a shuffle occurs).

Do we therefore still need to worry about repartitioning "manually"?

Also, is the number of partitions in the processed DataFrame related to the current default parallelism (spark.sparkContext.defaultParallelism) or to the number of partitions in the input DataFrame?
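To make the second question concrete, here is my rough mental model of what AQE does after a shuffle, written as a plain-Python sketch (no Spark required). It assumes AQE greedily merges adjacent shuffle partitions until a target size is reached, with the target playing the role of spark.sql.adaptive.advisoryPartitionSizeInBytes. The function name `coalesce_partitions` and the partition sizes are made up for illustration, and the real rule in Spark is likely more involved, so please correct me if this model is wrong.

```python
def coalesce_partitions(sizes_bytes, advisory_size_bytes):
    """Greedily merge adjacent shuffle partitions up to a target size.

    Simplified, hypothetical model of AQE coalescing; not Spark's actual code.
    Returns a list of groups, each group being the original partition indices
    that would be read together as one coalesced partition.
    """
    groups = []
    current, current_size = [], 0
    for idx, size in enumerate(sizes_bytes):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > advisory_size_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        groups.append(current)
    return groups


MB = 1024 * 1024
# Assumed post-shuffle partition sizes (illustrative numbers only).
shuffle_sizes = [10 * MB, 20 * MB, 30 * MB, 40 * MB, 5 * MB, 70 * MB]
groups = coalesce_partitions(shuffle_sizes, 64 * MB)
print(len(shuffle_sizes), "->", len(groups), "partitions:", groups)
```

If this model is roughly right, the final partition count would depend on the shuffle output sizes and the advisory size, rather than directly on defaultParallelism or the input partition count, which is the part I would like confirmed.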

QPeiran

0 Answers