Normally, when I run pyspark with graphframes I have to use this command:

pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12

The first time I run this, it downloads and installs the graphframes package; on subsequent runs the cached copy is used. In my .bashrc file, I have already added:

export SPARK_OPTS="--packages graphframes:graphframes:0.8.1-spark3.0-s_2.12"

But I cannot import the package unless I pass the --packages option.

How can I run pyspark with graphframes using just this command?

pyspark
huy

1 Answer

The simplest solution is to make a wrapper script, e.g. myspark.sh, that invokes pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 for you.

pltc
  • Yes that is the simplest way but I don't want to add the `--packages` after `pyspark` or `spark-submit` every time I submit a job. – huy May 19 '21 at 12:22