Questions tagged [apache-spark-encoders]
54 questions
6 votes · 1 answer
How to join two Spark Datasets into one with Java objects?
I have a little problem joining two datasets in Spark. I have this:
SparkConf conf = new SparkConf()
.setAppName("MyFunnyApp")
.setMaster("local[*]");
SparkSession spark = SparkSession
.builder()
.config(conf)
…

viti · 79 · 1 · 7
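A minimal Scala sketch of a typed two-Dataset join via `joinWith` (the `Order`/`Customer` classes and field names here are hypothetical stand-ins for the question's Java objects; from Java, the same shape works on `Encoders.bean`-backed Datasets):

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical types standing in for the question's Java objects.
case class Order(id: Long, customerId: Long)
case class Customer(id: Long, name: String)

object JoinSketch {
  // joinWith keeps both sides typed: the result is Dataset[(Order, Customer)],
  // not a flattened DataFrame, so the original objects survive the join.
  def join(spark: SparkSession): Dataset[(Order, Customer)] = {
    import spark.implicits._
    val orders    = Seq(Order(1L, 10L), Order(2L, 11L)).toDS()
    val customers = Seq(Customer(10L, "Ann"), Customer(11L, "Bob")).toDS()
    orders.joinWith(customers, orders("customerId") === customers("id"))
  }
}
```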
5 votes · 1 answer
How to implement Functor[Dataset]
I am struggling with how to create an instance of Functor[Dataset]. The problem is that when you map from A to B, an Encoder[B] must be in implicit scope, but I am not sure how to arrange that.
implicit val datasetFunctor: Functor[Dataset] = new…

Mikel San Vicente · 3,831 · 2 · 21 · 39
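The root of the problem is that Functor's `map[A, B](fa: F[A])(f: A => B): F[B]` leaves no room to pass an Encoder[B]. One workaround is a weaker, Encoder-aware typeclass (a sketch; `EncodedFunctor` is a hypothetical name, not part of cats or Spark):

```scala
import org.apache.spark.sql.{Dataset, Encoder}

// A Functor-like typeclass whose map additionally demands an Encoder
// for the result type, something plain Functor's signature cannot express.
trait EncodedFunctor[F[_]] {
  def map[A, B: Encoder](fa: F[A])(f: A => B): F[B]
}

object EncodedFunctor {
  implicit val datasetFunctor: EncodedFunctor[Dataset] = new EncodedFunctor[Dataset] {
    def map[A, B: Encoder](fa: Dataset[A])(f: A => B): Dataset[B] = fa.map(f)
  }
}
```

The alternative often suggested is keeping a true Functor[Dataset] but backing every result type with a kryo encoder, which trades the columnar schema for generality.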
5 votes · 1 answer
spark implicit encoder not found in scope
I have a problem with Spark, already outlined in spark custom kryo encoder not providing schema for UDF, but I have now created a minimal sample:
https://gist.github.com/geoHeil/dc9cfb8eca5c06fca01fc9fc03431b2f
class SomeOtherClass(foo: Int)
case class…

Georg Heiler · 16,916 · 36 · 162 · 292
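A sketch of the usual fix: the kryo-based encoder must be declared implicit and be in scope at the call site before createDataset runs (SomeOtherClass is taken from the question; the enclosing object name is hypothetical):

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

class SomeOtherClass(val foo: Int) // plain class: Spark cannot derive an encoder

object KryoScope {
  // Declared implicit so createDataset[SomeOtherClass] can resolve it.
  implicit val someOtherEncoder: Encoder[SomeOtherClass] = Encoders.kryo[SomeOtherClass]

  def build(spark: SparkSession): Dataset[SomeOtherClass] =
    spark.createDataset(Seq(new SomeOtherClass(1), new SomeOtherClass(2)))
}
```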
4 votes · 1 answer
Spark Encoders: when to use beans()
I came across a memory-management problem while using Spark's caching mechanism. I am currently using Encoders with Kryo and was wondering if switching to beans would help me reduce the size of my cached dataset.
Basically, what are the pros and…

Hatak · 53 · 1 · 6
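The encoder schemas already hint at the trade-off: Encoders.bean keeps one column per property (so cached data can use Spark's compact columnar layout), while Encoders.kryo stores each object as a single opaque binary blob. A small sketch with a hypothetical Employee bean:

```scala
import scala.beans.BeanProperty
import org.apache.spark.sql.Encoders

// Hypothetical Java-style bean; Encoders.bean needs getters/setters
// and a no-arg constructor, which @BeanProperty vars provide.
class Employee extends Serializable {
  @BeanProperty var id: Long = _
  @BeanProperty var name: String = _
}

object BeanVsKryo {
  val beanSchema = Encoders.bean(classOf[Employee]).schema // one column per property
  val kryoSchema = Encoders.kryo(classOf[Employee]).schema // a single binary column
}
```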
4 votes · 2 answers
How to create an Encoder for Scala collection (to implement custom Aggregator)?
Spark 2.3.0 with Scala 2.11. I'm implementing a custom Aggregator according to the docs here. The aggregator requires 3 types for input, buffer, and output.
My aggregator has to act upon all previous rows in the window so I declared it like…

Uncle Long Hair · 2,719 · 3 · 23 · 33
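A sketch of the standard answer for the buffer and output types: build the collection encoders with ExpressionEncoder (the aggregator itself is a hypothetical collect-to-Seq example, not the one in the question):

```scala
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical aggregator: collects all Double inputs into a Seq.
object CollectAgg extends Aggregator[Double, Seq[Double], Seq[Double]] {
  def zero: Seq[Double] = Seq.empty
  def reduce(buf: Seq[Double], in: Double): Seq[Double] = buf :+ in
  def merge(b1: Seq[Double], b2: Seq[Double]): Seq[Double] = b1 ++ b2
  def finish(buf: Seq[Double]): Seq[Double] = buf
  // ExpressionEncoder derives an encoder for the Scala collection type
  // from its TypeTag, which spark.implicits._ alone does not always supply.
  def bufferEncoder: Encoder[Seq[Double]] = ExpressionEncoder[Seq[Double]]()
  def outputEncoder: Encoder[Seq[Double]] = ExpressionEncoder[Seq[Double]]()
}
```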
4 votes · 2 answers
How to create a Dataset of Maps?
I'm using Spark 2.2 and am running into trouble when attempting to call spark.createDataset on a Seq of Map.
Code and output from my Spark Shell session follow:
// createDataSet on Seq[T] where T = Int works
scala> spark.createDataset(Seq(1, 2,…

wkl · 77,184 · 16 · 165 · 176
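In Spark 2.2 the implicits do not cover Map (an implicit Map encoder only arrived in 2.3), so the usual workaround is an explicit ExpressionEncoder. A sketch:

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

object MapDataset {
  def build(spark: SparkSession): Dataset[Map[String, Int]] = {
    // Spark 2.2's spark.implicits._ provides no Encoder[Map[K, V]];
    // ExpressionEncoder derives one from the TypeTag instead.
    implicit val mapEnc: Encoder[Map[String, Int]] = ExpressionEncoder()
    spark.createDataset(Seq(Map("a" -> 1), Map("b" -> 2)))
  }
}
```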
4 votes · 1 answer
Generic T as Spark Dataset[T] constructor
In the following snippet, the tryParquet function tries to load a Dataset from a Parquet file if it exists. If not, it computes, persists, and returns the Dataset plan which was provided:
import scala.util.{Try, Success, Failure}
import…

Jivan · 21,522 · 15 · 80 · 131
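A sketch of the usual shape of such a function: the type parameter carries an Encoder context bound so that `.as[T]` compiles for any T (the question's exact signature is truncated, so the parameter names here are assumptions):

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

object TryParquet {
  // T: Encoder makes the implicit Encoder[T] available to df.as[T].
  def tryParquet[T: Encoder](spark: SparkSession, path: String)
                            (compute: => Dataset[T]): Dataset[T] =
    Try(spark.read.parquet(path)) match {
      case Success(df) => df.as[T] // file exists: reuse it
      case Failure(_)  =>          // file missing: compute and persist
        val ds = compute
        ds.write.parquet(path)
        ds
    }
}
```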
4 votes · 1 answer
Spark: java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
I'm writing a Spark application using version 2.1.1. The following code raises an error when calling a method with a LocalDate parameter:
Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
-…

ca9163d9 · 27,283 · 64 · 210 · 413
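Spark 2.1 ships no Encoder for java.time.LocalDate (built-in java.time support only arrived with Spark 3.0); the common workaround is converting to java.sql.Date at the Dataset boundary. A sketch with a hypothetical Event case class:

```scala
import java.sql.Date
import java.time.LocalDate
import org.apache.spark.sql.{Dataset, SparkSession}

// java.sql.Date has a built-in encoder in Spark 2.x; LocalDate does not.
case class Event(name: String, date: Date)

object LocalDateWorkaround {
  def build(spark: SparkSession, day: LocalDate): Dataset[Event] = {
    import spark.implicits._
    // Convert on the way in; convert back with .toLocalDate on the way out.
    Seq(Event("launch", Date.valueOf(day))).toDS()
  }
}
```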
4 votes · 1 answer
Spark Error: Unable to find encoder for type stored in a Dataset
I am using Spark on a Zeppelin notebook, and groupByKey() does not seem to be working.
This code:
df.groupByKey(row => row.getLong(0))
.mapGroups((key, iterable) => println(key))
Gives me this error (presumably a compilation error, since it…

JackOrJones · 304 · 1 · 6 · 15
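The snippet's mapGroups function returns Unit (the result of println), and there is no Encoder[Unit]; returning an actual value, with spark.implicits._ imported in the notebook, resolves the error. A sketch:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object GroupByKeyFix {
  def counts(df: DataFrame): Dataset[(Long, Int)] = {
    val spark = df.sparkSession
    import spark.implicits._ // encoders for Long and (Long, Int)
    df.groupByKey(row => row.getLong(0))
      // println would make the group function return Unit, which has no
      // Encoder; returning a tuple gives Spark something it can encode.
      .mapGroups((key, rows) => (key, rows.size))
  }
}
```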
3 votes · 0 answers
DataType (UDT) v.s. Encoder in Spark SQL
Spark SQL has a limited set of DataTypes for schemas, and a limited set of Encoders for converting JVM objects to and from the internal Spark SQL representation.
In practice, we may see errors like this regarding DataType, which usually happens in…

jack · 1,787 · 14 · 30
3 votes · 1 answer
Impossible to operate on custom type after it is encoded? Spark Dataset
Say you have this (the solution for encoding a custom type comes from this thread):
// assume we handle custom type
class MyObj(val i: Int, val j: String)
implicit val myObjEncoder = org.apache.spark.sql.Encoders.kryo[MyObj]
val ds =…

jack · 1,787 · 14 · 30
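This runs into the known limitation of the binary kryo representation: the Dataset carries a single `value: binary` column, so column-wise operations fail; mapping to a tuple first restores a usable schema. A sketch using the question's MyObj:

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}

class MyObj(val i: Int, val j: String)

object KryoThenTuples {
  def flatten(spark: SparkSession): Dataset[(Int, String)] = {
    import spark.implicits._
    implicit val myObjEncoder = Encoders.kryo[MyObj]
    val ds = spark.createDataset(Seq(new MyObj(1, "x"), new MyObj(2, "y")))
    // ds has schema value: binary; the mapped tuple Dataset has real
    // columns (_1, _2) that filters, selects, and joins can operate on.
    ds.map(o => (o.i, o.j))
  }
}
```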
3 votes · 1 answer
Question regarding Kryo and Java encoders in Datasets
I am using Spark 2.4 and referring to
https://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence
Bean class:
public class EmployeeBean implements Serializable {
private Long id;
private String name;
private Long…

Dev · 13,492 · 19 · 81 · 174
3 votes · 2 answers
How to make an Encoder for Scala Iterable (Spark Dataset)
I'm trying to create a Dataset from an RDD y
Pattern: y: RDD[(MyObj1, scala.Iterable[MyObj2])]
So I explicitly created an encoder:
implicit def tuple2[A1, A2](
implicit e1: Encoder[A1],
e2:…

G.Saleh · 509 · 1 · 11 · 29
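A sketch of one way to compose the pair encoder: Encoders.tuple combines per-component encoders, and kryo covers Iterable, which Spark cannot derive on its own (MyObj1/MyObj2 are the question's placeholder types):

```scala
import org.apache.spark.sql.{Encoder, Encoders}

class MyObj1 extends Serializable
class MyObj2 extends Serializable

object IterableEncoders {
  // kryo handles the components reflection cannot; tuple glues them together
  // into an Encoder[(MyObj1, Iterable[MyObj2])] with two binary columns.
  implicit val pairEncoder: Encoder[(MyObj1, Iterable[MyObj2])] =
    Encoders.tuple(Encoders.kryo[MyObj1], Encoders.kryo[Iterable[MyObj2]])
}
```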
3 votes · 0 answers
spark custom kryo encoder not providing schema for UDF
When following along with How to store custom objects in Dataset? and trying to register my own kryo encoder for a data frame, I hit the error Schema for type com.esri.core.geometry.Envelope is not supported.
There is a function which will parse a…

Georg Heiler · 16,916 · 36 · 162 · 292
2 votes · 1 answer
Is there an Encoder for Map type in Java Spark?
I am trying to create a custom Aggregator function producing a Map as the result; however, it requires an Encoder. As referenced in
https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/Encoders.html, there isn't one for now.
Does anyone…

Nguyễn Hải Hà · 31 · 3
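As of Spark 2.1 the Encoders factory indeed has no map() method; a common stand-in for an Aggregator's output encoder is kryo (sketched here in Scala; from Java the analogue is Encoders.kryo with a class token and an unchecked cast):

```scala
import org.apache.spark.sql.{Encoder, Encoders}

object MapEncoderWorkaround {
  // No Encoders.map(...) exists in 2.1; kryo serializes the whole Map
  // into one binary column, which suffices for an Aggregator's output.
  val mapEncoder: Encoder[Map[String, Long]] = Encoders.kryo[Map[String, Long]]
}
```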