I am following the Snowflake documentation and trying to run a simple script from this page: https://docs.snowflake.com/en/user-guide/spark-connector-use.html
It fails with the following error:
Py4JJavaError: An error occurred while calling o37.load.
: java.lang.ClassNotFoundException: Failed to find data source: net.snowflake.spark.snowflake.
My code is below. I also tried setting the spark.jars config option to the paths of the JDBC and spark-snowflake jars, which are located in the /Users/Hana/spark-sf/ directory, but no luck.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql import SparkSession
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config('spark.jars','/Users/Hana/spark-sf/snowflake-jdbc-3.12.9.jar,/Users/Hana/spark-sf/spark-snowflake_2.12-2.8.1-spark_3.0.jar') \
    .getOrCreate()
# Set options below
sfOptions = {
    "sfURL" : "<account_name>.snowflakecomputing.com",
    "sfUser" : "<user_name>",
    "sfPassword" : "<password>",
    "sfDatabase" : "<database>",
    "sfSchema" : "<schema>",
    "sfWarehouse" : "<warehouse>"
}
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
df = spark.read.format(SNOWFLAKE_SOURCE_NAME) \
    .options(**sfOptions) \
    .option("query", "select * from table limit 200") \
    .load()
df.show()
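Since the error says the data source class cannot be found, one thing worth ruling out is a wrong jar path. Below is a minimal sketch of a check that could be run before building the session. The helper name build_spark_jars is hypothetical (it is not part of PySpark or the Snowflake connector), and it only confirms that the files exist on disk, not that Spark actually loads them.

```python
import os


def build_spark_jars(jar_paths):
    """Return a comma-separated string suitable for the spark.jars option.

    Hypothetical helper: raises FileNotFoundError if any path is not an
    existing file, so a typo in a jar path shows up before Spark starts.
    """
    missing = [p for p in jar_paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("Jar file(s) not found: " + ", ".join(missing))
    return ",".join(jar_paths)


# Example usage with the paths from the script above:
# jars = build_spark_jars([
#     "/Users/Hana/spark-sf/snowflake-jdbc-3.12.9.jar",
#     "/Users/Hana/spark-sf/spark-snowflake_2.12-2.8.1-spark_3.0.jar",
# ])
# spark = SparkSession.builder.config("spark.jars", jars).getOrCreate()
```

If this raises, the problem is the path itself; if it passes, the jars are present and the problem is more likely in how or when the option is applied.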
How should I set these variables properly, and which ones need to be set? If someone could list the steps, I would greatly appreciate it!