
I'm using the latest SJS version (master), and the application extends SparkHiveJob. In the runJob implementation, I have the following:

val eDF1 = hive.applySchema(rowRDD1, schema)

I would like to persist eDF1 and tried the following:

    val rdd_topersist = namedObjects.getOrElseCreate("cleanedDF1", {
        NamedDataFrame(eDF1, true, StorageLevel.MEMORY_ONLY)
       })

which produces the following compile errors:

could not find implicit value for parameter persister: spark.jobserver.NamedObjectPersister[spark.jobserver.NamedDataFrame] 
not enough arguments for method getOrElseCreate: (implicit timeout:scala.concurrent.duration.FiniteDuration, implicit persister:spark.jobserver.NamedObjectPersister[spark.jobserver.NamedDataFrame])spark.jobserver.NamedDataFrame. Unspecified value parameter persister.

Obviously something is wrong here, but I can't figure out what. I'm fairly new to Scala.

Can someone help me understand this syntax from NamedObjectSupport?

def getOrElseCreate[O <: NamedObject](name: String, objGen: => O)
                                    (implicit timeout: FiniteDuration = defaultTimeout,
                                    persister: NamedObjectPersister[O]): O
user1384205

1 Answer


I think you need to define an implicit persister, so that the implicit `persister: NamedObjectPersister[O]` parameter of `getOrElseCreate` can be resolved. Looking at the test code, I see something like this:

https://github.com/spark-jobserver/spark-jobserver/blob/ea34a8f3e3c90af27aa87a165934d5eb4ea94dee/job-server-extras/test/spark.jobserver/NamedObjectsSpec.scala#L20
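To make that concrete, here is a rough sketch of what your `runJob` body could look like with the persister in implicit scope. This is based on the linked test file, so I'm assuming the `DataFramePersister` class from `job-server-extras` is available on your classpath:

    import org.apache.spark.storage.StorageLevel
    import spark.jobserver.{DataFramePersister, NamedDataFrame, NamedObjectPersister}

    // Bring a persister for NamedDataFrame into implicit scope so the
    // compiler can fill in getOrElseCreate's implicit persister parameter.
    implicit def dataFramePersister: NamedObjectPersister[NamedDataFrame] =
      new DataFramePersister

    val rdd_topersist = namedObjects.getOrElseCreate("cleanedDF1", {
      NamedDataFrame(eDF1, forceComputation = true, StorageLevel.MEMORY_ONLY)
    })

The key point is that the `implicit def` (or an `implicit val`) must be visible at the call site; without it, you get exactly the "could not find implicit value for parameter persister" error you are seeing.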

noorul