
I'm a newbie with PySpark and Spark NLP, and I want to use Spark NLP in a Docker container with GPU support on WSL 2 (Windows 10). After installing Spark NLP I can use pretrained models and pipelines, but there is no speed difference between CPU and GPU. nvidia-smi shows that the model is loaded into GPU memory. Can you please tell me which versions of the libraries I have to install, or what kind of problem this is?

Thanks

1 Answer


You have two options for setting up GPU support in Spark NLP, depending on how you start the session. Either start it through the helper:

import sparknlp

spark = sparknlp.start(gpu=True)

or build the session yourself and pass the GPU package explicitly:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Spark NLP") \
    .master("local[*]") \
    .config("spark.driver.memory", "16G") \
    .config("spark.driver.maxResultSize", "0") \
    .config("spark.kryoserializer.buffer.max", "2000M") \
    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.2.2") \
    .getOrCreate()
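
Either way, a quick sanity check is to confirm the session actually picked up the GPU artifact. This is only a minimal sketch; the spark.jars.packages lookup assumes the package was set on this session as shown above:

import sparknlp

print(sparknlp.version())   # Spark NLP library version, e.g. 4.2.2
print(spark.version)        # underlying Apache Spark version
# should mention "spark-nlp-gpu" if the GPU build was requested
print(spark.conf.get("spark.jars.packages", "not set"))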

Also, whether the GPU actually helps depends on which model you're using and on the dataset size: deep-learning annotators such as transformer embeddings on reasonably large batches benefit the most, so don't expect an automatic speed-up for every pipeline. A rough way to compare is sketched below.
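
Here is a minimal benchmarking sketch; the model name small_bert_L2_128, the row count, and the batch size are just illustrative assumptions. Run it once with gpu=True and once with gpu=False and compare the elapsed times.

import time
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings
from pyspark.ml import Pipeline

spark = sparknlp.start(gpu=True)   # flip to gpu=False for the CPU baseline

# a small synthetic dataset; GPU gains usually show up on larger batches
data = spark.createDataFrame([("Spark NLP runs on the GPU.",)] * 1000, ["text"])

document = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
embeddings = BertEmbeddings.pretrained("small_bert_L2_128", "en") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings") \
    .setBatchSize(32)

pipeline = Pipeline(stages=[document, tokenizer, embeddings])

start = time.time()
result = pipeline.fit(data).transform(data)
# explode forces the embeddings to actually be computed
result.selectExpr("explode(embeddings) AS emb").count()
print(f"elapsed: {time.time() - start:.1f}s")

If CPU and GPU timings are still identical on a transformer-heavy workload like this, double-check that the GPU artifact (spark-nlp-gpu) is the one on the classpath.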

AlbertoAndreotti