I'm using com.johnsnowlabs.nlp-2.2.2 with spark-2.4.4 to process some articles. In those articles, there are some very long words I'm not interested in and which slows down the POS tagging a lot.
I would to like to exclude them after the…
I have been trying to run the John Snow Spark-NLP example from this repository:
https://github.com/JohnSnowLabs/spark-nlp/blob/master/example/src/TrainViveknSentiment.scala
on my local machine. But it throws the org.apache.spark.SparkException:…
I want to use the JohnSnowLabs pretrained spell check module in my Zeppelin notebook. As mentioned here I have added com.johnsnowlabs.nlp:spark-nlp_2.11:1.7.3 to the Zeppelin dependency section as shown below:
However, when I try to run the…
Can I please seek your help on how to install and use Spark NLP and Pyspark in Kaggle notebook when the internet is disabled?
I have already attempted myself quite a number of times, but unfortunately, I am still not able to get it worked. Your…
I am trying to use DocumentAssembler for array of strings. The documentation says: "The DocumentAssembler can read either a String column or an Array[String])".
But when I do a simple example:
data = spark.createDataFrame([[["Spark NLP is an…
I want to install Spark-NLP on Apache Spark Pools on Azure Synapse Analytics.
I added the spark_nlp-4.4.0-py2.py3-none-any.whl & spark-nlp_2.12-4.4.0.jar as workspace packages.
Workspace configuration runs without errors and can import SparkNLP…
I want to use this vocab and then tokenize and normalize my text inputs. I am new to scala and hence was not able add vocab to tokenizer api. The default tokenizer works very well.
How do I build custom tokenizer and normalize stages?
Can I combine sparknlp with pyspark? I have a data (of tweets) consists of two category features "keyword" and "location", and one free textual "text". I am trying to build a a sentence embeddings using GoogleUniversalSentenceEncoder, and add two…
I have two sentences (questions) data, and a label where they mean the same or not (is_duplicate). What I am trying to do, is to build a model above two universal sentence encoders, which will classify whether they are equal or not.
Here is my…
I'm building a pipeline in Spark NLP (version 3.2.1) to create Tokens from a string column that contains searched words by words separated by comma.
documentAssemblerteste = DocumentAssembler() \
.setInputCol("searched_terms")…
Spark NLP Evaluation com.johnsnowlabs.nlp.eval._ for Scala 2.12.x not available. With spark version 3.x, it's not working with libraryDependency "com.johnsnowlabs.nlp" %% "spark-nlp-eval" % "2.2.2" in sbt. There is no repo for Scala version 2.12.x…
Can anyone tell what is libraryDependencies for TFNerDLGraphBuilder() for Spark with Scala? It gives me error, Cannot resolve symbol TFNerDLGraphBuilder
I see it works for notebook as given…
when i import sparknlp and run
sparknlp.start()
I get the following error
pyspark version 3.1.2
sparknlp version is 4.2.2
Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
:…
I'm newbie in pyspark and spark-nlp and i want to use spark-nlp in docker container with GPU support on WSL-2 Windows 10.
After installing spark-nlp I can use pretrained models and pipelines, but there is no difference between CPU and GPU speed.…
I have a spark cluster set up and would like to integrate spark-NLP to run Word Embeddings. I have downloaded the Glove Embeddings 6B 100 model from the model download page and placed the unzipped files in Glove. When I run the following…