
I am looking to use the Kinesis Client Library (KCL) with Spark Streaming from PySpark. Any pointers would be helpful.

I tried a few of the approaches given in the Spark Kinesis integration link.

But I get an error about a Java class reference.

It seems Python is calling into a Java class.

I tried linking spark-streaming-kinesis-asl-assembly_2.10-2.0.0-preview.jar when running the KCL app on Spark,

but I still get the error.

Please let me know if anyone has already done this.

When I search online, most results are about Twitter and Kafka; I could not find much help for Kinesis.

Spark version used: 1.6.3

Giri
  • I tried with spark-streaming-kinesis-asl_2.10-1.6.3.jar; the error I got is: ...Caused by: java.lang.ClassNotFoundException: com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream – Giri Jul 07 '17 at 10:33

1 Answer


I encountered the same problem. The kinesis-asl jar was missing several files.

To work around this, I included the following jars in my spark-submit command:

  1. amazon-kinesis-client-1.9.0.jar
  2. aws-java-sdk-1.11.310.jar
  3. jackson-dataformat-cbor-2.6.7.jar

Note: I am using Spark 2.3.0, so the jar versions listed may not be the ones you need for your Spark version.
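
For illustration, here is a minimal Python sketch of how those jars could be passed to spark-submit through its `--jars` option, which takes a single comma-separated list. The helper name `build_spark_submit` and the script name `kinesis_app.py` are hypothetical, and the jar paths assume the files sit in the current directory; depending on your setup, the kinesis-asl jar itself would likely need to be in the same list.

```python
import shlex

# Jar names as listed above (used with Spark 2.3.0); adjust versions and paths for your setup.
KINESIS_JARS = [
    "amazon-kinesis-client-1.9.0.jar",
    "aws-java-sdk-1.11.310.jar",
    "jackson-dataformat-cbor-2.6.7.jar",
]


def build_spark_submit(app_path, jars):
    """Return a spark-submit argument list; --jars expects one comma-separated value."""
    return ["spark-submit", "--jars", ",".join(jars), app_path]


if __name__ == "__main__":
    # kinesis_app.py is a hypothetical name for your PySpark streaming script.
    cmd = build_spark_submit("kinesis_app.py", KINESIS_JARS)
    # Print the command so it can be copied into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can also be handed directly to `subprocess.run(cmd)` if you prefer to launch the job from Python rather than pasting the printed command.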

Hope this helps.

chwps