
I have a function like:

def createDataset[T](seq:Seq[T]): Dataset[T] = {
    import spark.implicits._
    seq.toDS()
}

This does not compile; the compiler cannot find the toDS method.

It also doesn't work this way:

def createDataset[T](t:T): Dataset[T] = {
    import spark.implicits._
    Seq(t).toDS()
}

The case classes that I'm using are:

case class Person(id: Long, name: String, age: Int)
case class Address(a_id: Long, street: String, number: Int)

What can I do to have a generic function that creates a Dataset given a generic type T that is always a case class?

Edit:

The solution provided by Terry Dactyl is not working for me; it shows this error when f is called:

import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

def f[T <: Product : Encoder](s: Seq[T]): Dataset[T] = {
  val spark = SparkSession.builder.getOrCreate()
  import spark.implicits._
  s.toDF.as[T]
}

f(Seq(
    Person(1, "John", 25),
    Person(2, "Paul", 22)
))

No implicits found for parameter ev$1: Encoder[Person]

Pau Trepat
    You aren't using the solution correctly. Creating the session inside the function and bringing the implicit `Encoder` into its closure won't bring `Encoder[T]` into the outer scope. Implicits have to be provided as in the answer (outside the closure). – zero323 Nov 09 '18 at 12:09
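
For illustration, here is a minimal sketch of the usage zero323 describes, with the session and its implicits provided at the call site so that `Encoder[Person]` is in scope when f is invoked (the local master setting is an assumption for this example):

import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

// Session and implicits live at the call site, outside f
val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._  // provides Encoder[T] for case classes

def f[T <: Product : Encoder](s: Seq[T]): Dataset[T] =
  s.toDF.as[T]

case class Person(id: Long, name: String, age: Int)

// Encoder[Person] is resolved here, from spark.implicits._
val ds: Dataset[Person] = f(Seq(Person(1, "John", 25), Person(2, "Paul", 22)))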

1 Answer

import org.apache.spark.sql._
import spark.implicits._  // assumes an existing SparkSession value named `spark`, e.g. in spark-shell

def f[T <: Product : Encoder](s: Seq[T]): Dataset[T] = {
  s.toDF.as[T]
}
case class C(a: Int, b: Int)

f(Seq(C(1, 2), C(3, 4), C(5, 6)))

res0: org.apache.spark.sql.Dataset[C] = [a: int, b: int]
Terry Dactyl
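
As a closing note (not part of the original answer), the context-bound implicit can also be avoided by requesting the product encoder explicitly; a minimal sketch, assuming the caller passes in an existing SparkSession:

import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}
import scala.reflect.runtime.universe.TypeTag

def createDataset[T <: Product : TypeTag](spark: SparkSession, seq: Seq[T]): Dataset[T] = {
  // Encoders.product derives an Encoder for any case class (Product) from its TypeTag
  implicit val enc: Encoder[T] = Encoders.product[T]
  spark.createDataset(seq)
}

createDataset(spark, Seq(C(1, 2), C(3, 4)))  // reuses C and spark from the answer above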