I am new to Scala and Spark.
I am trying to use an encoder to read a file with Spark and then convert each row to a Java/Scala object.
The first step, reading the file with a schema applied and encoding it with as, works fine.
I then run a simple map operation on that Dataset/DataFrame, but when I print the schema of the resulting Dataset/DataFrame, it doesn't show any columns.
Also, when I first read the file, I don't map the age field of the Person class, because I want to calculate it in the map function as an experiment. However, the age field does not appear in the DataFrame built with Person at all.
Data in Person.txt:
firstName,lastName,dob
ABC, XYZ, 01/01/2019
CDE, FGH, 01/02/2020
Below is the code:
import java.time.Year

import org.apache.spark.sql.{Encoders, SparkSession}

object EncoderExample extends App {
  val sparkSession = SparkSession.builder().appName("EncoderExample").master("local").getOrCreate()

  case class Person(firstName: String, lastName: String, dob: String, var age: Int = 10)

  implicit val encoder = Encoders.bean[Person](classOf[Person])

  // Read the CSV with a header row and convert it to a Dataset[Person]
  val personDf = sparkSession.read.option("header", "true").option("inferSchema", "true").csv("Person.txt").as(encoder)
  personDf.printSchema()
  personDf.show()

  // Calculate age from the year part of dob
  val calAge = personDf.map(p => {
    p.age = Year.now().getValue - p.dob.substring(6).toInt
    println(p.age)
    p
  }) //.toDF()//.as(encoder)

  print("*********Person DF Schema after age calculation: ")
  calAge.printSchema()
  //calAge.show
}
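
To rule out the age arithmetic itself, the calculation from the map function can be checked outside Spark. This is only a minimal sketch: AgeSketch and ageFromDob are hypothetical names that are not part of the code above, the current year is passed in explicitly so the result is repeatable, and the trim call is an assumption added because the sample rows have a space after each comma that may be preserved when the file is read.

```scala
object AgeSketch {
  // Hypothetical helper mirroring the expression used in the map function:
  // current year minus the four-digit year at the end of a "xx/xx/yyyy" string.
  // trim is an added assumption to tolerate the leading space in the sample rows.
  def ageFromDob(dob: String, currentYear: Int): Int =
    currentYear - dob.trim.substring(6).toInt
}
```

For example, AgeSketch.ageFromDob(" 01/01/2019", 2024) evaluates to 5. Without the trim, substring(6) on that same value would start at the second slash rather than at the year.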