
I have a Spark cluster set up and would like to integrate Spark NLP to run word embeddings. I downloaded the GloVe Embeddings 6B 100 model from the model download page and placed the unzipped files in a directory named glove. When I run the following code:

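from sparknlp.annotator import WordEmbeddingsModel

# Load the unzipped GloVe model from a path relative to the working directory
word_embeddings = WordEmbeddingsModel.load("./glove")\
    .setInputCols(["document", "normal"])\
    .setOutputCol("embeddings")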

it works fine locally, but I don't know how to make it work with spark-submit.

I have tried a few approaches, such as:

spark-submit --master spark://remote-host:remote-port --files pyspark_pex_env.pex --jars spark-nlp_2.12-4.1.0.jar --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.1.0 example.py

I also tried serving the local model files over HTTP (at an address the workers can reach) and loading them with:

word_embeddings=WordEmbeddingsModel.load("http://local-host:port/glove")\
    .setInputCols(["document","normal"])\
    .setOutputCol("embeddings")

None of these approaches work, and I don't know what to try next. How should I load the model when running under spark-submit?
