I have a text file in an RDD, like so: sc.textFile(<file_name>).
I try to repartition the RDD in order to speed up processing: sc.repartition(<n>).
No matter what I put in for <n>, the partition count does not seem to change: RDD.getNumPartitions() always prints the same number (3).
How do I change the number of partitions to increase performance?
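
One likely cause, sketched here assuming PySpark (the question does not say which language binding is in use): RDDs are immutable, so repartition(<n>) returns a new RDD and leaves the original untouched. If the returned RDD is not kept, getNumPartitions() on the original keeps reporting the old count. The names repartition_and_count, demo, path and n below are hypothetical and used only for illustration.

```python
def repartition_and_count(rdd, n):
    """Return (partitions before, partitions after) for an RDD.

    repartition(n) does not modify `rdd`; it returns a new RDD,
    so the result has to be captured to see the new partition count.
    """
    before = rdd.getNumPartitions()
    repartitioned = rdd.repartition(n)  # keep the returned RDD
    return before, repartitioned.getNumPartitions()


def demo(path, n=8):
    # Assumes a working PySpark install; `path` and `n` are placeholders.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    rdd = sc.textFile(path)

    rdd.repartition(n)                  # result discarded: `rdd` is unchanged
    print(rdd.getNumPartitions())       # still the original count

    print(repartition_and_count(rdd, n))
```

Calling demo with the path to the text file should show the original count first, then a pair whose second value is n. Whether more partitions actually speeds up processing depends on the data size and the cluster, so this only addresses why the reported number does not change.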