I am using Spark NLP to annotate a long text file in Databricks. My code looks like this:
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
val lines = sc.textFile("/FileStore/tables/48320_0-3f0d3.txt")
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val result = PretrainedPipeline("explain_document_ml").annotate(lines)
But I get the following error:
command-2722311848879511:1: error: overloaded method value annotate with alternatives:
(target: Array[String])Array[Map[String,Seq[String]]] <and>
(target: String)Map[String,Seq[String]]
cannot be applied to (org.apache.spark.rdd.RDD[String])
val result = PretrainedPipeline("explain_document_ml").annotate(lines)
Since annotate can take a String or an Array[String] as its parameter, why can't I pass the text file (loaded with sc.textFile) as the parameter? How should I modify my code? Thanks!
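
For reference, here is a minimal sketch of what the error seems to point at: sc.textFile returns an RDD[String], while the two overloads listed in the message only accept String or Array[String]. The sketch assumes a Databricks notebook where sc and spark are predefined, and that the file is small enough to collect to the driver. The DataFrame variant at the end further assumes that PretrainedPipeline exposes transform and that this pipeline reads its input from a column named "text"; both are assumptions, not something the error message confirms.

```scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Assumption: running in a Databricks notebook, so `sc` and `spark` already exist.
val lines = sc.textFile("/FileStore/tables/48320_0-3f0d3.txt")

val pipeline = PretrainedPipeline("explain_document_ml")

// Option 1: collect() turns the RDD[String] into an Array[String] on the driver,
// which matches the annotate(target: Array[String]) overload shown in the error.
// Only suitable when the whole file fits in driver memory.
val result: Array[Map[String, Seq[String]]] = pipeline.annotate(lines.collect())

// Option 2 (assumption: transform(DataFrame) is available and the pipeline
// expects an input column named "text"): keep the data distributed.
import spark.implicits._
val df = lines.toDF("text")
val annotated = pipeline.transform(df)
```

Option 1 stays closest to the original code. Option 2 may be the better fit for a long file, since nothing is pulled back to the driver.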