
I wrote the following code, which aims to transform a DataFrame into a Dataset using a case class:

def toDs[T](df: DataFrame): Dataset[T] = {
  df.as[T]
}

with the case class:

case class DATA(name: String, age: Double, location: String)

I am getting:

Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]     df.as[T]

Any idea how to fix this?

scalacode

1 Answer

You can read the data into a Dataset[MyCaseClass] in the following two ways:

Say you have the following case class (for illustration, with the same fields as DATA from the question):

1) First way: import the SparkSession implicits into scope and use the as operator to convert your DataFrame to a Dataset[MyCaseClass]:

case class MyCaseClass(name: String, age: Double, location: String)

import org.apache.spark.sql.{Dataset, SparkSession}

val spark: SparkSession = SparkSession.builder.enableHiveSupport.getOrCreate()

import spark.implicits._

val ds: Dataset[MyCaseClass] = spark.read.format("FORMAT_HERE").load().as[MyCaseClass]
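
This also explains the error in the question: inside the generic toDs[T] helper, the compiler has no way to find an Encoder[T]. Adding a context bound defers encoder resolution to the call site, where spark.implicits._ is in scope. A minimal sketch:

import org.apache.spark.sql.{DataFrame, Dataset, Encoder}

// Require an implicit Encoder[T]; df.as[T] picks it up
def toDs[T: Encoder](df: DataFrame): Dataset[T] = df.as[T]

// At a call site with spark.implicits._ imported:
// val ds: Dataset[MyCaseClass] = toDs[MyCaseClass](df)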

2) Second way: define your own encoder in a separate object and import it into your current code:

package com.funky.encoders

import org.apache.spark.sql.{Encoder, Encoders}

case class MyCaseClass(name: String, age: Double, location: String)

object MyCustomEncoders {

  // Encoders.product derives an Encoder for any Product type (case class)
  implicit val myCaseClassEncoder: Encoder[MyCaseClass] = Encoders.product[MyCaseClass]

}

In the file containing the main method, import the members of the above object so that the implicit encoder is in scope:

import com.funky.encoders.MyCaseClass
import com.funky.encoders.MyCustomEncoders._
import org.apache.spark.sql.{Dataset, SparkSession}

val spark: SparkSession = SparkSession.builder.enableHiveSupport.getOrCreate()

val ds: Dataset[MyCaseClass] = spark.read.format("FORMAT_HERE").load().as[MyCaseClass]
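
Either way, the fix is the same: df.as[T] needs an implicit Encoder[T] in scope. With the custom encoder imported, the context-bound toDs helper sketched above works too; a hedged usage sketch, assuming a DataFrame named df whose columns match MyCaseClass:

import com.funky.encoders.MyCaseClass
import com.funky.encoders.MyCustomEncoders._

// The imported implicit Encoder[MyCaseClass] satisfies the T: Encoder context bound
val typed: Dataset[MyCaseClass] = toDs[MyCaseClass](df)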
Yayati Sule