
I am following the community project at https://github.com/spark-redshift-community/spark-redshift#python to connect to Redshift, and it seems to require Avro dependencies even though I am not using Avro as the input data format. My Scala version is 2.12, and the dependencies I have added are:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-avro_2.12</artifactId>
        <version>3.1.2</version>
    </dependency>
   
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-mapred</artifactId>
        <version>1.7.7</version>
        <classifier>hadoop2</classifier>
    </dependency>

But this is the exception I get:

    User class threw exception: org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".
        at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:676)
        at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:743)
        at
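For context, the error message's reference to the "Apache Avro Data Source Guide" deployment section means the `spark-avro` module must be on the driver and executor classpath at runtime, not just declared at compile time in the pom. One way I have tried (sketch only; the application jar name is a placeholder, and the spark-avro version should match the cluster's Spark version) is:

```shell
# Pull spark-avro at submit time so it reaches driver and executors;
# my-app.jar is a placeholder for the actual application jar.
spark-submit \
  --packages org.apache.spark:spark-avro_2.12:3.1.2 \
  my-app.jar
```
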
