I am very new to Zeppelin/Spark and couldn't find an accurate description of the steps for configuring new dependencies such as NLP libraries. I found a similar issue here.

I was trying to use the John Snow Labs NLP library in a Zeppelin notebook (Spark version 2.2.1). The setup included:

  1. In Zeppelin's interpreter configuration for Spark, I included the following artifact: com.johnsnowlabs.nlp:spark-nlp_2.11:2.5.4
  2. Then, in conf/zeppelin-env.sh, I set SPARK_SUBMIT_OPTIONS: export SPARK_SUBMIT_OPTIONS="--packages JohnSnowLabs:spark-nlp:2.2.2". Then I restarted Zeppelin.

But the program below gives this error:

%spark
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._

<console>:26: error: object johnsnowlabs is not a member of package com
       import com.johnsnowlabs.nlp.base._
                  ^
<console>:27: error: object johnsnowlabs is not a member of package com
       import com.johnsnowlabs.nlp.annotator._

Can someone please share how this can be done? I referred to this link. TIA

1 Answer

You don't need to edit conf/zeppelin-env.sh (and you're using it incorrectly anyway, since it specifies a completely different version, 2.2.2, from the 2.5.4 artifact in the interpreter settings). You can make all the changes through the Zeppelin UI. Go to the Spark interpreter configuration and put com.johnsnowlabs.nlp:spark-nlp_2.11:2.5.4 into the spark.jars.packages configuration property (add the property if it doesn't exist), and also add it under Dependencies at the end of the configuration (for some reason, it isn't automatically pulled into the driver classpath).
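
After saving the interpreter settings and restarting the Spark interpreter, a quick check along these lines may help confirm that the package was picked up. This is only a sketch: it assumes it runs in a Zeppelin %spark paragraph, where Zeppelin predefines sc, and it uses com.johnsnowlabs.nlp.DocumentAssembler merely as a representative class from the library.

```scala
// Run in a Zeppelin %spark paragraph; `sc` is the SparkContext that Zeppelin provides.
import scala.util.Try

// Show what the interpreter was actually started with (None if the property is unset).
val configured: Option[String] = sc.getConf.getOption("spark.jars.packages")
println(s"spark.jars.packages = ${configured.getOrElse("<not set>")}")

// Assumption: DocumentAssembler is used here only as a sample class from spark-nlp.
// If this prints false, the jar is not visible on the driver classpath.
val onDriverClasspath: Boolean =
  Try(Class.forName("com.johnsnowlabs.nlp.DocumentAssembler")).isSuccess
println(s"spark-nlp visible on driver classpath: $onDriverClasspath")
```

If the property is set but the class is not found, that would be consistent with the artifact missing from the Dependencies section of the interpreter settings.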

Alex Ott