Questions tagged [johnsnowlabs-spark-nlp]

John Snow Labs’ NLP is a natural language processing tool built on top of Apache Spark ML pipelines.

100 questions
1
vote
2 answers

requirement failed: Wrong or missing inputCols annotators in johnsnowlabs.nlp

I'm using com.johnsnowlabs.nlp-2.2.2 with spark-2.4.4 to process some articles. In those articles, there are some very long words I'm not interested in and which slow down the POS tagging a lot. I would like to exclude them after the…
ticapix
  • 1,534
  • 1
  • 11
  • 15
1
vote
1 answer

I get 'Task not serializable' when I try to run the John Snow spark-nlp example in Scala

I have been trying to run the John Snow Spark-NLP example from this repository: https://github.com/JohnSnowLabs/spark-nlp/blob/master/example/src/TrainViveknSentiment.scala on my local machine. But it throws the org.apache.spark.SparkException:…
1
vote
1 answer

Not able to use JohnSnowLabs pretrained model in Zeppelin

I want to use the JohnSnowLabs pretrained spell check module in my Zeppelin notebook. As mentioned here I have added com.johnsnowlabs.nlp:spark-nlp_2.11:1.7.3 to the Zeppelin dependency section as shown below: However, when I try to run the…
user3243499
  • 2,953
  • 6
  • 33
  • 75
0
votes
0 answers

Use Spark NLP and Pyspark in Kaggle notebook with Internet off

Can I please seek your help on how to install and use Spark NLP and Pyspark in a Kaggle notebook when the internet is disabled? I have already tried quite a number of times myself, but unfortunately, I am still not able to get it to work. Your…
gracenz
  • 137
  • 1
  • 10
0
votes
1 answer

Pyspark use DocumentAssembler on array

I am trying to use DocumentAssembler for an array of strings. The documentation says: "The DocumentAssembler can read either a String column or an Array[String])". But when I try a simple example: data = spark.createDataFrame([[["Spark NLP is an…
0
votes
0 answers

How to install Spark NLP on Azure Synapse Spark Pools?

I want to install Spark-NLP on Apache Spark Pools on Azure Synapse Analytics. I added the spark_nlp-4.4.0-py2.py3-none-any.whl & spark-nlp_2.12-4.4.0.jar as workspace packages. Workspace configuration runs without errors and can import SparkNLP…
0
votes
0 answers

Use custom vocab or vocab from pretrained model to tokenize text in scala

I want to use this vocab and then tokenize and normalize my text inputs. I am new to Scala and hence was not able to add the vocab to the tokenizer API. The default tokenizer works very well. How do I build custom tokenizer and normalizer stages?
teksan
  • 142
  • 13
0
votes
1 answer

TypeError: Cannot recognize a pipeline stage of type

Can I combine sparknlp with pyspark? I have a dataset (of tweets) that consists of two categorical features, "keyword" and "location", and one free-text feature, "text". I am trying to build sentence embeddings using GoogleUniversalSentenceEncoder, and add two…
Eli Borodach
  • 554
  • 3
  • 9
  • 22
0
votes
0 answers

Trying to concat two sentence embedding using SparkNLP

I have data consisting of pairs of sentences (questions) and a label indicating whether they mean the same thing or not (is_duplicate). What I am trying to do is build a model on top of two universal sentence encoders that will classify whether they are equal or not. Here is my…
Eli Borodach
  • 554
  • 3
  • 9
  • 22
0
votes
1 answer

How to set Tokenizer() function of Spark NLP to split tokens by comma?

I'm building a pipeline in Spark NLP (version 3.2.1) to create tokens from a string column that contains searched words separated by commas. documentAssemblerteste = DocumentAssembler() \ .setInputCol("searched_terms")…
0
votes
0 answers

Spark NLP Evaluation (com.johnsnowlabs.nlp.eval._) for Scala 2.12?

Spark NLP Evaluation (com.johnsnowlabs.nlp.eval._) is not available for Scala 2.12.x. With Spark version 3.x, it does not work with the libraryDependency "com.johnsnowlabs.nlp" %% "spark-nlp-eval" % "2.2.2" in sbt. There is no repo for Scala version 2.12.x…
Faaiz
  • 635
  • 8
  • 18
0
votes
1 answer

libraryDependencies for `TFNerDLGraphBuilder()` for Spark with Scala

Can anyone tell me what the libraryDependencies entry is for TFNerDLGraphBuilder() for Spark with Scala? It gives me the error: Cannot resolve symbol TFNerDLGraphBuilder. I see it works for notebook as given…
Faaiz
  • 635
  • 8
  • 18
0
votes
0 answers

sparkcontext error while running sparknlp.start()

When I import sparknlp and run sparknlp.start(), I get the following error. The pyspark version is 3.1.2 and the sparknlp version is 4.2.2. Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. :…
0
votes
1 answer

Install spark-nlp with GPU

I'm a newbie in pyspark and spark-nlp and I want to use spark-nlp in a Docker container with GPU support on WSL-2 Windows 10. After installing spark-nlp I can use pretrained models and pipelines, but there is no difference between CPU and GPU speed.…
0
votes
0 answers

can't run pretrained spark-nlp model from local to spark server

I have a spark cluster set up and would like to integrate spark-NLP to run Word Embeddings. I have downloaded the Glove Embeddings 6B 100 model from the model download page and placed the unzipped files in Glove. When I run the following…