1

I'm trying to apply an optional transformation (parseRow) to a Dataset with flatMap, so that Nones are discarded and Somes are unwrapped.

import org.apache.spark.sql.SparkSession

object Main
{
    def parseRow(line: String): Option[Map[String, String]] = Some(Map())

    def main(args: Array[String])
    {
        val ss = SparkSession.builder.getOrCreate()

        ss.read.textFile("somewhere")
          .flatMap(parseRow _)
          .write.parquet("somewhere/else")
    }
}

Unfortunately, I got the following error:

[error] Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]           .flatMap(parseRow _)
[error]                   ^

I tried the fix suggested in the error, i.e. I added import ss.implicits._, but it didn't change anything.

I think I might be in a situation similar to the one described here (i.e. for some reason the implicit conversion from Option to TraversableOnce is not being applied), but I'm not good enough in Scala to find a solution for my case.

I tried to change that line to .flatMap(line => parseRow(line)) or .flatMap(parseRow), but it didn't work.

lodo
  • 2,314
  • 19
  • 31

0 Answers0