
I was sharing my RDDs between jobs typed as CassandraRow, but now that I'm joining several RDDs together, a case class makes more sense.

I save my RDD as shown below and then retrieve it in a new job. This worked fine with type CassandraRow. CData is the same case class, declared in both jobs.

// mergedRDD: RDD[CData]
runtime.namedObjects.update("rdd:session", NamedRDD(mergedRDD, forceComputation = false, storageLevel = StorageLevel.MEMORY_ONLY))

val NamedRDD(dbDayRDD, _, _) = runtime.namedObjects.get[NamedRDD[CData]]("rdd:session").get

Promos Job Failed {
  "duration": "0.545 secs",
  "classPath": "spark.jobserver.Promos",
  "startTime": "2017-08-08T18:07:02.131Z",
  "context": "dailycontext",
  "result": {
    "message": "java.lang.ClassCastException: spark.jobserver.SessionNew$CData$3 cannot be cast to spark.jobserver.Promos$CData$3",
    "errorClass": "java.lang.Throwable",
ozzieisaacs

1 Answer


Turns out that you cannot redeclare the case class in each job. The fully qualified class name has to match for the two definitions to be considered the same class; otherwise the object deserialized as one job's CData cannot be cast to the other job's CData, and you get the ClassCastException above. I moved all my case class definitions into a single shared file and imported that file into each job.
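A minimal sketch of the problem and the fix, with Spark left out. The object names come from the error message; the CData field is a hypothetical stand-in for whatever your real case class holds:

```scala
// Before: each job declares its own nested CData. The two classes get
// different fully qualified runtime names, so one cannot be cast to the other.
object SessionNew { case class CData(key: String) }  // runtime name ends in SessionNew$CData
object Promos     { case class CData(key: String) }  // runtime name ends in Promos$CData

// After: one shared, top-level definition that both jobs import and use.
// In a real project this lives in its own file, e.g. CommonTypes.scala.
case class CData(key: String)

object Demo extends App {
  // The nested definitions are distinct classes despite identical shapes:
  println(SessionNew.CData("a").getClass.getName)
  println(Promos.CData("a").getClass.getName)

  // With the shared top-level definition, both jobs see the same class,
  // so an RDD[CData] written by one job deserializes cleanly in the other:
  val fromSession: CData = CData("a")
  val seenByPromos: CData = fromSession  // no cast needed
  println(seenByPromos.key)
}
```

The `$3` suffix in the error message additionally suggests the class was declared inside a method body, which makes the generated names diverge even further between compilations.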

ozzieisaacs