I use pySpark to write a parquet file, and I would like to change the HDFS block size of that file. I set the block size like this, but it doesn't work:
sc._jsc.hadoopConfiguration().set("dfs.block.size", "128m")
Does this have to be set before starting the pySpark job? If so, how do I do it?
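For context, this is roughly the full sequence I run; the app name, output path, and DataFrame are placeholders for what my actual job uses:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("write-parquet").getOrCreate()
sc = spark.sparkContext

# Attempt to change the HDFS block size of the output file.
sc._jsc.hadoopConfiguration().set("dfs.block.size", "128m")

# Placeholder DataFrame; in my real job this comes from an existing source.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# The written parquet file still ends up with the cluster's default HDFS block size.
df.write.parquet("/tmp/output.parquet")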