Questions tagged [johnsnowlabs-spark-nlp]

John Snow Labs’ NLP is a natural language processing tool built on top of Apache Spark ML pipelines.

100 questions
1
vote
1 answer

Sparknlp Java Error While Trying to Display Model Results

I'm trying to output the results from a practice NLP model created using Spark-NLP. However, I keep getting the error below. Can anyone help me out here? The .show() method works earlier in the code, when I attempt to output the dataframe. It just…
Clovis
  • 183
  • 1
  • 8
1
vote
1 answer

Error in installation of spark NLP for Healthcare

As per https://nlp.johnsnowlabs.com/docs/en/licensed_install, the command to install spark-nlp-jsl is as below. pip install -q spark-nlp-jsl==${version} --extra-index-url https://pypi.johnsnowlabs.com/${secret.code} --upgrade I tried by providing a…
1
vote
2 answers

"Param poolingLayer does not exist" error coming while loading BERT embedding model in spark-nlp

My NLP pipeline uses the pre-trained BERT embedding model "bert_base_uncased" from johnsnowlabs. But while loading this downloaded model I am getting the following exception. Caused by: java.util.NoSuchElementException: Param poolingLayer does not exist. …
dev.ak
  • 29
  • 4
1
vote
1 answer

Cannot use SparkNLP pre-trained T5Transformer, executor fails with error "No Operation named [encoder_input_ids] in the Graph"

Downloaded T5-small model from SparkNLP website, and using this code (almost entirely from the examples): import com.johnsnowlabs.nlp.SparkNLP import com.johnsnowlabs.nlp.annotators.seq2seq.T5Transformer import…
shay__
  • 3,815
  • 17
  • 34
1
vote
1 answer

PicklingError: Could not serialize object in Pyspark

I am getting a pickling error with a Spark UDF I wrote. It applies the Spark pipeline to each row of the data frame and returns the class (a boolean value, True or False). The following pipeline is working for the data added in the list. I have…
joel
  • 1,156
  • 3
  • 15
  • 42
1
vote
0 answers

Use Conda to install Spark NLP with GPU support?

The Spark NLP Installation instructions: https://nlp.johnsnowlabs.com/docs/en/install have several different methods for installing Spark NLP, in a few different languages. They have instructions for installing it with GPU support, but not for…
SHB11
  • 375
  • 5
  • 14
1
vote
0 answers

Cannot run spark-nlp due to Exception: Java gateway process exited before sending its port number

I have a working Pyspark installation running through Jupyter on an Ubuntu VM. Only one Java version (openjdk version "1.8.0_265"), and I can run a local Spark (v2.4.4) session like this without problems: import pyspark from pyspark.sql import…
LukasKawerau
  • 1,071
  • 2
  • 23
  • 42
1
vote
1 answer

How to install offline Spark NLP packages

How can I install offline Spark NLP packages without an internet connection? I've downloaded the package (recognizee_entities_dl) and uploaded it to the cluster. I've installed Spark NLP using pip install spark-nlp==2.5.5. I'm using PySpark and from…
John Doe
  • 9,843
  • 13
  • 42
  • 73
1
vote
1 answer

object johnsnowlabs is not a member of package com

I am very new to Zeppelin/spark and couldn't get an accurate description of steps to configure new dependencies like that of NLP libraries. Found similar issue here. I was trying to use Johnsnowlabs NLP library in Zeppelin notebook (spark…
1
vote
1 answer

Using pretrained models from sparknlp on Databricks

I am trying to follow the official examples from John Snow Labs but every time I get a TypeError: 'JavaPackage' object is not callable error. I followed all of the steps in the Databricks install documentation but no matter what walkthrough I try,…
Frank B.
  • 1,813
  • 5
  • 24
  • 44
1
vote
1 answer

Spark equivalent to Keras Tokenizer?

So far, I pre-process text data using numpy and built-in functions (such as the Keras tokenizer class, tf.keras.preprocessing.text.Tokenizer: https://keras.io/api/preprocessing/text/). And that is where I got stuck: Since I am trying to scale up my model…
1
vote
1 answer

Can't get Spark NLP working on Databricks

I've done the following: import pyspark from pyspark.sql import SparkSession from pyspark import SparkContext, SparkConf, SQLContext spark = SparkSession \ .builder \ .appName('Amazon ETL') \ .config('spark.jars.packages',…
1
vote
0 answers

How to annotate a textFile using sparknlp?

I am using Sparknlp to annotate a long text file in Databricks. My code is like this: import com.johnsnowlabs.nlp.base._ import com.johnsnowlabs.nlp.annotator._ val lines = sc.textFile("/FileStore/tables/48320_0-3f0d3.txt") import…
Qiang Yao
  • 165
  • 12
1
vote
1 answer

How do we extract named entities in scala using any nlp library

I have a huge text file and I have to extract only the named entities from this file. I am using the Scala language and a Databricks cluster for this. val input = sc.textFile('....Mypath...').flatMap(line => line.split("""\W+""")) val namedEnt =…
1
vote
1 answer

Persist BERT model on disk as pickle file

I have managed to get the BERT model to work with the johnsnowlabs-spark-nlp library. I am able to save the "trained model" on disk as follows. Fit Model df_bert_trained = bert_pipeline.fit(textRDD) df_bert=df_bert_trained.transform(textRDD) save…
user8291021
  • 326
  • 2
  • 9