I have two datasets, a `Dataset[User]` and a `Dataset[Book]`, where both `User` and `Book` are case classes. I join them like this:

    val joinDS = ds1.join(ds2, "userid")

If I try to `map` over each element of `joinDS`, the compiler complains that an encoder is missing:
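For reference, here is a minimal sketch of my setup (the field names and data are illustrative, not my real schema):

```scala
import org.apache.spark.sql.SparkSession

// Illustrative case classes; my real ones have more fields.
case class User(userid: Int, name: String)
case class Book(userid: Int, title: String)

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("join-demo")
  .getOrCreate()
import spark.implicits._

val ds1 = Seq(User(1, "Alice"), User(2, "Bob")).toDS() // Dataset[User]
val ds2 = Seq(Book(1, "Dune"), Book(1, "Emma")).toDS() // Dataset[Book]

// Equi-join on the shared "userid" column.
val joinDS = ds1.join(ds2, "userid")
```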
```
not enough arguments for method map: (implicit evidence$46: org.apache.spark.sql.Encoder[Unit])org.apache.spark.sql.Dataset[Unit].
Unspecified value parameter evidence$46.
Unable to find encoder for type stored in a Dataset.
```
However, the same error does not occur if I use `foreach` instead of `map`. Why doesn't `foreach` require an encoder as well? I have already imported all implicits from the Spark session, so why does `map` require an encoder at all, when the dataset is the result of joining two datasets of case classes? Also, what type of dataset do I get from that join? Is it a `Dataset[Row]`, or something else?
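To make the comparison concrete, this self-contained sketch shows roughly the two calls I am contrasting (again with made-up case classes; I assume the `Unit` in the error comes from my lambda returning `Unit`):

```scala
import org.apache.spark.sql.SparkSession

case class User(userid: Int, name: String)
case class Book(userid: Int, title: String)

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("map-vs-foreach")
  .getOrCreate()
import spark.implicits._

val joinDS = Seq(User(1, "Alice")).toDS()
  .join(Seq(Book(1, "Dune")).toDS(), "userid")

// joinDS.map(row => println(row))     // does NOT compile: no Encoder[Unit]
joinDS.foreach(row => println(row))    // compiles: foreach asks for no encoder

// Mapping to a type Spark can encode (here String) compiles fine
// once spark.implicits._ is in scope:
val titles = joinDS.map(_.getAs[String]("title")) // Dataset[String]
```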