I have an input file of 849MB. When I read this file in the pyspark shell using sc.textFile() and check the number of partitions, I get 27. I have another file of 2.60GB, and for that file the number of partitions is 84. Both counts would make sense if dfs.block.size were 32MB (849MB / 32MB ≈ 27 and 2.60GB / 32MB ≈ 84). I am running locally with 4 cores.
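For reference, this is roughly what I am running in the pyspark shell (the file paths below are placeholders for my actual files):

    >>> rdd1 = sc.textFile("/path/to/input_849mb.txt")   # 849MB file (placeholder path)
    >>> rdd1.getNumPartitions()
    27
    >>> rdd2 = sc.textFile("/path/to/input_2_6gb.txt")   # 2.60GB file (placeholder path)
    >>> rdd2.getNumPartitions()
    84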
But when I checked dfs.block.size, it was 128MB. I don't understand what's happening or how the pyspark shell is calculating the number of partitions.
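I'm not sure whether the way I looked up the value matters, but this is one way to check it from inside the shell, reading the Hadoop configuration through the (internal) _jsc handle:

    >>> # dfs.blocksize is the current key; dfs.block.size is the older, deprecated name
    >>> sc._jsc.hadoopConfiguration().get("dfs.blocksize")
    >>> sc._jsc.hadoopConfiguration().get("dfs.block.size")   # this is where I see 128MB (134217728)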