
I am facing the above error while trying to consume logs from a Kafka topic, process them, and push them into Solr.

The problem appears when I add the Solr publishing part: I am able to consume the Kafka stream and print it either to HDFS or to the console. For this specific part, I followed this simple example https://github.com/mganta/streaming-data/blob/master/src/main/java/com/example/streaming/CarEventsProcessor.java as I could not make sense of the documentation of the spark-solr library from Lucidworks.

    val topics = //my topics
    val kafkaParams = Map[String, Object](...)

    val stream =
      KafkaUtils.createDirectStream[String, String](
        ssc, // the same StreamingContext that is started below
        PreferConsistent,
        Subscribe[String, String](topics, kafkaParams)
      )

    val processed = //process stream

    def convert(field_to_process: ...): SolrInputDocument = {

      // create the document to push to Solr
      // test with a basic document; note that most Solr schemas require
      // a unique key field (usually "id") on every document
      val doc = SolrSupport.autoMapToSolrInputDoc("", null, Map())
      doc.addField("left", "right")

      doc
    }

    // spark-solr expects the ZooKeeper connect string of the Solr cluster
    // as the first argument, not the Kafka brokers
    SolrSupport.indexDStreamOfDocs(zkHost, "table", 1, processed.map(convert))

    ssc.start()
    ssc.awaitTermination()
    ssc.stop()
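For reference, here is a minimal sketch of what I believe a valid document builder looks like, building the `SolrInputDocument` directly with SolrJ instead of `autoMapToSolrInputDoc`. The field names are illustrative: `"id"` assumes the schema's default uniqueKey, and `"value_s"` assumes the default managed schema's `*_s` string dynamic field:

```scala
import java.util.UUID
import org.apache.solr.common.SolrInputDocument

// Build one Solr document per consumed record.
def toDoc(value: String): SolrInputDocument = {
  val doc = new SolrInputDocument()
  // assumption: the collection's uniqueKey field is named "id"
  doc.addField("id", UUID.randomUUID().toString)
  // assumption: "*_s" dynamic string field exists in the schema
  doc.addField("value_s", value)
  doc
}
```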

I suspect a dependency error. Here is the relevant part of my pom.xml:

        <spark.version>2.2.1</spark.version>
        <scala.version>2.11.8</scala.version>
        <scala.compat.version>2.11</scala.compat.version>
        <spark.solr.version>3.4.5</spark.solr.version>
        <solr.version>7.3.0</solr.version>
        <fasterxml.version>2.9.9</fasterxml.version>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.compat.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${scala.compat.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_${scala.compat.version}</artifactId>
            <version>2.0.0</version>
        </dependency>

        <dependency>
            <groupId>com.lucidworks.spark</groupId>
            <artifactId>spark-solr</artifactId>
            <version>${spark.solr.version}</version>
        </dependency>

        <!-- slf4j libraries -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.5</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.25</version>
        </dependency>

        <dependency>
            <groupId>org.apache.solr</groupId>
            <artifactId>solr-core</artifactId>
            <version>${solr.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
            <version>${fasterxml.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${fasterxml.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-annotations</artifactId>
            <version>${fasterxml.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_${scala.compat.version}</artifactId>
            <version>${fasterxml.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.jackson</groupId>
            <artifactId>jackson-jaxrs</artifactId>
            <version>1.9.8</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.jackson</groupId>
            <artifactId>jackson-core-asl</artifactId>
            <version>1.9.8</version>
        </dependency>
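For what it's worth, one mismatch I notice in the above is that `spark-streaming-kafka-0-10` is pinned to 2.0.0 while the other Spark artifacts use `${spark.version}` (2.2.1). If it matters, aligning the connector with the rest of Spark would look like this (a sketch, assuming no other conflicts):

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_${scala.compat.version}</artifactId>
    <!-- keep the Kafka connector at the same version as spark-core/spark-streaming -->
    <version>${spark.version}</version>
</dependency>
```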
  • if you have already mentioned solr-core ... is there any need to mention httpclient? Can you check by removing httpclient? – Abhijit Bashetti May 27 '19 at 11:38
  • I tried, and the application holds in ACCEPTED mode... So far, I have witnessed different behaviours: sometimes it runs but produces no output logs, sometimes it holds in ACCEPTED mode, and other times it fails with the above error. Could the method I used be the reason it is not working? – soap May 27 '19 at 14:58
  • I didn't understand... did it work or not? – Abhijit Bashetti May 27 '19 at 15:26
  • No, it does not work. The same error keeps coming back in the end. – soap May 28 '19 at 08:44
  • as you're adding "spark-solr", there is no need to add "solr-core" explicitly... try removing the "solr-core" dependency – Abhijit Bashetti May 28 '19 at 08:56
  • Just did it, and it did nothing to improve things... – soap May 28 '19 at 08:58
