I'm trying to output the results from a practice NLP model created using Spark-NLP. However, I keep getting the error below. Can anyone help me out here? The .show() method works earlier in the code, when I attempt to output the dataframe. It just…
As per https://nlp.johnsnowlabs.com/docs/en/licensed_install, the command to install spark-nlp-jsl is as below.
pip install -q spark-nlp-jsl==${version} --extra-index-url https://pypi.johnsnowlabs.com/${secret.code} --upgrade
I tried by providing a…
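The ${version} and ${secret.code} placeholders in the command above stand for values the user supplies. A minimal sketch of filling them in from Python follows, assuming the command shape quoted from the docs is correct. The function name and the sample values are hypothetical and not part of any John Snow Labs API.

```python
from typing import List


def build_install_command(version: str, secret: str) -> List[str]:
    """Return the pip command as an argument list with placeholders filled in.

    `version` and `secret` are whatever values came with the license;
    the values used in the example below are made up.
    """
    return [
        "pip", "install", "-q",
        f"spark-nlp-jsl=={version}",
        "--extra-index-url", f"https://pypi.johnsnowlabs.com/{secret}",
        "--upgrade",
    ]


# Hypothetical values, for illustration only.
command = build_install_command("0.0.0", "my-secret")
print(" ".join(command))
```

Passing the result to subprocess.run as a list avoids shell interpolation of the placeholder syntax.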
My NLP pipeline uses the pre-trained BERT embedding model "bert_base_uncased" from johnsnowlabs. But while loading this downloaded model, I am getting the following exception.
Caused by: java.util.NoSuchElementException: Param poolingLayer does not exist.
…
I downloaded the T5-small model from the Spark NLP website and am using this code (almost entirely from the examples):
import com.johnsnowlabs.nlp.SparkNLP
import com.johnsnowlabs.nlp.annotators.seq2seq.T5Transformer
import…
I am getting a pickling error with a Spark UDF I wrote. It applies the Spark pipeline to each row of the data frame and returns the class (a boolean value, True or False).
The following pipeline is working for the data added in the list. I have…
The Spark NLP Installation instructions:
https://nlp.johnsnowlabs.com/docs/en/install
have several different methods for installing Spark NLP, in a few different languages. They have instructions for installing it with GPU support, but not for…
I have a working PySpark installation running through Jupyter on an Ubuntu VM.
Only one Java version (openjdk version "1.8.0_265"), and I can run a local Spark (v2.4.4) session like this without problems:
import pyspark
from pyspark.sql import…
How can I install Spark NLP packages offline, without an internet connection?
I've downloaded the package (recognize_entities_dl) and uploaded it to the cluster.
I've installed Spark NLP using pip install spark-nlp==2.5.5.
I'm using PySpark and from…
I am very new to Zeppelin/Spark and couldn't find an accurate description of the steps to configure new dependencies such as NLP libraries.
I found a similar issue here.
I was trying to use the John Snow Labs NLP library in a Zeppelin notebook (spark…
I am trying to follow the official examples from John Snow Labs but every time I get a TypeError: 'JavaPackage' object is not callable error. I followed all of the steps in the Databricks install documentation but no matter what walkthrough I try,…
So far, I pre-process text data using numpy and built-in functions (such as the Keras tokenizer class, tf.keras.preprocessing.text.Tokenizer: https://keras.io/api/preprocessing/text/).
And that is where I got stuck:
Since I am trying to scale up my model…
I am using Spark NLP to annotate a long text file in Databricks. My code is like this:
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
val lines = sc.textFile("/FileStore/tables/48320_0-3f0d3.txt")
import…
I have a huge text file and I have to extract only the named entities from this file. I am using Scala and a Databricks cluster for this.
val input = sc.textFile("....Mypath...").flatMap(line => line.split("""\W+"""))
val namedEnt =…
I have managed to get the BERT model to work with the johnsnowlabs-spark-nlp library. I am able to save the "trained model" to disk as follows.
Fit Model
df_bert_trained = bert_pipeline.fit(textRDD)
df_bert=df_bert_trained.transform(textRDD)
save…