
I have a Cloudera distribution of Hadoop, Spark, etc., where the Spark–Kafka integration version is 0.8 (i.e. spark-streaming-kafka-0-8_2.11).

The issue is that version 0.8 of the Spark–Kafka integration has Kafka 0.8.2.1 built in, and I require 0.10.0.1.

Is there a way to work around this? I do not want to use spark-streaming-kafka-0-10_2.11 because it is not a stable version.

I tried adding this to my Maven dependencies (and packaging the jars with my application), but the cluster classpath is taking precedence over my Maven dependencies:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
        <version>2.3.0.cloudera1</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.kafka</groupId>
                <artifactId>kafka_2.11</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.11</artifactId>
        <version>0.10.0.1</version>
    </dependency>
OneCricketeer

1 Answer


You will need to put the Kafka dependency above the Spark dependency. When two declarations at the same depth pull in conflicting versions, Maven keeps the one declared first, so listing Kafka 0.10.0.1 before the Spark artifact makes it win. It should look something like this:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.11</artifactId>
    <version>0.10.0.1</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
    <version>2.3.0.cloudera1</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
        </exclusion>
    </exclusions>
</dependency>
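Separately, if the cluster classpath still wins at runtime, Spark can be told to prefer classes from the application jar via its `userClassPathFirst` settings (both are marked experimental in the Spark configuration docs). A minimal sketch of a submit command, where the class and jar names are placeholders:

```shell
# userClassPathFirst makes Spark load classes from the application jar
# before the cluster-provided jars. Both flags are experimental.
# com.example.MyStreamingApp and my-app-assembly.jar are placeholders.
spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class com.example.MyStreamingApp \
  my-app-assembly.jar
```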
BeyondPerception