I am using Dataproc image version 2.0.x on Google Cloud because Delta Lake 0.7.0 is available in that image version. However, this Dataproc image ships with PySpark 3.1.1 by default, and Apache Spark 3.1.1 has not been officially released yet, so there is no Delta Lake release compatible with Spark 3.1; downgrading to 3.0.x is therefore suggested.
I have tried the following as the root user on the master node of the Dataproc cluster:
pip install --force-reinstall pyspark==3.0.1
However, when I then check with pyspark --version, it still shows 3.1.1.
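For reference, here is how I am checking which version is picked up (a diagnostic sketch; my assumption is that the cluster's bundled Spark launcher may be shadowing the pip-installed one on PATH):

```shell
# Which pyspark launcher does the shell resolve? If this points at the
# cluster's bundled Spark install rather than pip's bin directory, the
# pip-installed 3.0.1 is being shadowed on PATH.
which pyspark

# Version that pip actually installed (may differ from the launcher above)
pip show pyspark | grep -i '^version'
```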
How can I make 3.0.1 the default PySpark version on this cluster?