I've just started using Dataproc to do machine learning on big data stored in BigQuery. When I try to run this code:
df = spark.read.format('bigquery').load('bigquery-public-data.samples.shakespeare')
I get an error that includes the following:
java.lang.ClassNotFoundException: Failed to find data source: bigquery. Please find packages at http://spark.apache.org/third-party-projects.html
I found a tutorial in this GitHub repo: https://github.com/GoogleCloudDataproc/spark-bigquery-connector
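As far as I can tell, that tutorial is about making the connector available to Spark when the session starts, since the error says the `bigquery` data source cannot be found. Here is a minimal sketch of what I think that looks like in PySpark. The Maven coordinates (Scala 2.12 build, version 0.36.1) and the helper name `build_session` are my assumptions, not something I have confirmed against my cluster:

```python
# Assumed Maven coordinates for the spark-bigquery connector.
# The Scala suffix (_2.12) and the version are assumptions; they
# would need to match the Spark build on the cluster.
CONNECTOR_PACKAGE = (
    "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.36.1"
)


def build_session(app_name="bigquery-example"):
    """Create a SparkSession that asks Spark to fetch the connector.

    `spark.jars.packages` is a standard Spark setting that downloads the
    listed Maven packages and puts them on the classpath. It only takes
    effect if no session is already running.
    """
    # Imported here so the module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.jars.packages", CONNECTOR_PACKAGE)
        .getOrCreate()
    )
```

If this is right, calling `spark = build_session()` before the `spark.read.format('bigquery')` line should let Spark resolve the data source. A notebook or shell that already has a running session would presumably need the package supplied at startup instead.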
But I don't know where to write those scripts or how to run them. Could you help me understand?
Thanks in advance