I am trying to write an HDInsight Spark application that reads streaming data from an Azure Event Hub. I am using a Zeppelin notebook with the Livy interpreter.
I need the dependency com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.2, so I added it to the livy.spark.jars.packages property of the Livy interpreter. However, as soon as that property is set, the Livy session fails to start and none of my code runs. Even without the line
import org.apache.spark.eventhubs._
I still get a failure. (I don't usually use wildcard imports, but this is just a proof-of-concept application.)
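For reference, the interpreter change described above amounts to a single property. The key/value layout below is only an illustrative sketch: in the Zeppelin interpreter settings page the property is entered as a name and a value, not in a file.

```
# Livy interpreter property (name = value); layout is illustrative
livy.spark.jars.packages = com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.2
```

Removing this property is the only difference between the failing setup and the default one.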
The error I am getting is:
org.apache.zeppelin.livy.LivyException: Session 8 is finished, appId: application_[NUMBER], log: [ ApplicationMaster RPC port: -1, queue: default, start time: 1533304077387, final status: UNDEFINED, tracking URL: http://[LIVY_SERVER_HOSTNAME]:8088/proxy/application_[NUMBER]/, user: livy, 18/08/03 13:47:57 INFO ShutdownHookManager: Shutdown hook called, 18/08/03 13:47:57 INFO ShutdownHookManager: Deleting directory /tmp/spark-[id],
YARN Diagnostics: , Application killed by user.]
at org.apache.zeppelin.livy.BaseLivyInterpreter.createSession(BaseLivyInterpreter.java:300)
at org.apache.zeppelin.livy.BaseLivyInterpreter.initLivySession(BaseLivyInterpreter.java:184)
at org.apache.zeppelin.livy.LivySharedInterpreter.open(LivySharedInterpreter.java:57)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.livy.BaseLivyInterpreter.getLivySharedInterpreter(BaseLivyInterpreter.java:165)
at org.apache.zeppelin.livy.BaseLivyInterpreter.open(BaseLivyInterpreter.java:139)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:493)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I suspect this is not really a problem with Livy or Zeppelin, but rather a setting that I have configured incorrectly or that needs to be changed from its default, possibly related to downloading the jar.
Any help would be appreciated.