You can read the data into a Dataset[MyCaseClass] in either of the following two ways.
Say you have the following case class:
case class MyCaseClass(/* your fields here */)
1) First way: import the SparkSession implicits into scope and use the as method to convert your DataFrame to a Dataset[MyCaseClass]:
import org.apache.spark.sql.{Dataset, SparkSession}
case class MyCaseClass(/* your fields here */)
val spark: SparkSession = SparkSession.builder.enableHiveSupport.getOrCreate()
import spark.implicits._
val ds: Dataset[MyCaseClass] = spark.read.format("FORMAT_HERE").load().as[MyCaseClass]
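For example, here is a minimal end-to-end sketch of the first approach. It assumes a hypothetical MyCaseClass(id: Int, name: String) and a Parquet file at a made-up path; both are placeholders, not part of the original example:

import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical fields; replace with your own. Keep the case class at the
// top level (not inside a method) so Spark can derive an encoder for it.
case class MyCaseClass(id: Int, name: String)

object ReadWithImplicits {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.enableHiveSupport.getOrCreate()
    import spark.implicits._

    // "/path/to/data" is a placeholder path
    val ds: Dataset[MyCaseClass] = spark.read.parquet("/path/to/data").as[MyCaseClass]

    // Typed operations can use the case class fields directly
    ds.filter(_.id > 0).show()

    spark.stop()
  }
}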
2) Second way: create your own Encoder in a separate object and import it into the code where you read the data:
package com.funky.encoders
import org.apache.spark.sql.{Encoder, Encoders}
case class MyCaseClass(/* your fields here */)
object MyCustomEncoders {
  implicit val myCaseClassEncoder: Encoder[MyCaseClass] = Encoders.product[MyCaseClass]
}
In the file containing the main method, import the members of that object so the implicit Encoder is in scope:
import com.funky.encoders.MyCaseClass
import com.funky.encoders.MyCustomEncoders._
import org.apache.spark.sql.{Dataset, SparkSession}
val spark: SparkSession = SparkSession.builder.enableHiveSupport.getOrCreate()
val ds: Dataset[MyCaseClass] = spark.read.format("FORMAT_HERE").load().as[MyCaseClass]
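Putting the second approach together, a minimal sketch of the consuming file could look like the one below (again with a hypothetical Parquet path as a placeholder). Note that Encoders.product only supplies the encoder for MyCaseClass itself; if you later need encoders for tuples or primitives (for example in a map), you would still import spark.implicits._.

import com.funky.encoders.{MyCaseClass, MyCustomEncoders}
import MyCustomEncoders._ // brings the implicit Encoder[MyCaseClass] into scope
import org.apache.spark.sql.{Dataset, SparkSession}

object ReadWithCustomEncoder {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.enableHiveSupport.getOrCreate()

    // "/path/to/data" is a placeholder path
    val ds: Dataset[MyCaseClass] = spark.read.parquet("/path/to/data").as[MyCaseClass]

    ds.show()
    spark.stop()
  }
}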