
I have an Enumeratum enum and need to load it into a Spark data frame. This fails due to a missing encoder.

import enumeratum._

sealed trait Foo extends EnumEntry

object Foo extends Enum[Foo] {

  val values = findValues

  case object Baz extends Foo
  case object Bar extends Foo
}

case class FooBar(a: Int, lotOfOtherFields: String, xxxx: Seq[Foo])

import spark.implicits._
Seq(FooBar(1, "one", Seq(Foo.Baz)), FooBar(2, "two", Seq(Foo.Bar))).toDF

This fails with `No Encoder found for type Foo`. How can I project the case class (without boilerplate) so that it:

  • either works fine in Spark directly (I do not want binary Kryo output),
  • or is converted to strings via Foo.Baz.entryName (but without the boilerplate of defining a near-identical class), something along the lines of Seq(FooBar(1, "one", Seq(Foo.Baz)), FooBar(2, "two", Seq(Foo.Bar))).map(allFieldsButXxxx, xxxx.map(_.entryName))
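For reference, the second option can be sketched by hand as below. This is only an illustrative sketch, not a solution to the boilerplate problem: the local-mode `SparkSession` setup, the tuple projection, and the column names are my assumptions, not from the question.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session setup; in spark-shell a `spark` value already exists.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Project the enum field to its entryName (a plain String),
// for which Spark already has a built-in encoder.
val df = Seq(FooBar(1, "one", Seq(Foo.Baz)), FooBar(2, "two", Seq(Foo.Bar)))
  .map(fb => (fb.a, fb.lotOfOtherFields, fb.xxxx.map(_.entryName)))
  .toDF("a", "lotOfOtherFields", "xxxx")

df.printSchema()
// `xxxx` is now array<string> instead of an unencodable Seq[Foo]
```

Note that this still repeats every other field by hand, which is exactly the boilerplate the question asks to avoid.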
Georg Heiler
  • So far I could only resort to applying the Enum's entryName (String), but this is suboptimal. – Georg Heiler Jun 11 '19 at 07:42
  •
    I'd love to figure this out too; this doesn't let you use enumeratum, but at least serializes correctly: http://monkeythinkmonkeycode.com/eums-in-spark-datasets/ – Michael K Feb 14 '20 at 16:49

0 Answers