Questions tagged [apache-spark-encoders]
54 questions
2
votes
1 answer
Column type inferred as binary with typed UDAF
I'm trying to implement a typed UDAF that returns a complex type. Somehow Spark cannot infer the type of the result column and makes it binary, putting the serialized data there. Here's a minimal example that reproduces the problem:
import…

synapse
- 5,588
- 6
- 35
- 65
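
A minimal sketch of the usual fix, assuming the binary column comes from the aggregator's outputEncoder falling back to a serialized (kryo) encoder; the Stats type and its fields are hypothetical:

import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical complex result type.
case class Stats(count: Long, sum: Double)

object StatsAgg extends Aggregator[Double, Stats, Stats] {
  def zero: Stats = Stats(0L, 0.0)
  def reduce(b: Stats, a: Double): Stats = Stats(b.count + 1, b.sum + a)
  def merge(x: Stats, y: Stats): Stats = Stats(x.count + y.count, x.sum + y.sum)
  def finish(r: Stats): Stats = r
  // Product encoders keep the result as a real struct column;
  // Encoders.kryo[Stats] here would yield the binary column instead.
  def bufferEncoder: Encoder[Stats] = Encoders.product[Stats]
  def outputEncoder: Encoder[Stats] = Encoders.product[Stats]
}
// usage on a Dataset[Double]: ds.select(StatsAgg.toColumn)
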
2
votes
1 answer
Why doesn't dataset's foreach method require an encoder, but map does?
I have two datasets: Dataset[User] and Dataset[Book] where both User and Book are case classes. I join them like this:
val joinDS = ds1.join(ds2, "userid")
If I try to map over each element in joinDS, the compiler complains that an encoder is…

vaer-k
- 10,923
- 11
- 42
- 59
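
Why the asymmetry: map produces a new Dataset, so Spark must know how to serialize the new element type; foreach returns Unit and never builds one. A short sketch, assuming a live SparkSession named spark:

case class User(userid: String, name: String)
case class Book(userid: String, title: String)

import spark.implicits._   // encoders for case classes, tuples, primitives

val ds1 = Seq(User("u1", "Ann")).toDS()
val ds2 = Seq(Book("u1", "Dune")).toDS()
val joinDS = ds1.join(ds2, "userid")   // a DataFrame, i.e. Dataset[Row]

// foreach runs a side effect per element and returns Unit:
// no new Dataset is produced, so no Encoder is needed.
joinDS.foreach(row => println(row))

// map builds a new Dataset[String], so an Encoder[String] must be
// in scope (it comes from spark.implicits._ above).
val userIds = joinDS.map(row => row.getAs[String]("userid"))
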
2
votes
1 answer
Apache Spark 2.1 : java.lang.UnsupportedOperationException: No Encoder found for scala.collection.immutable.Set[String]
I am using Spark 2.1.1 with Scala 2.11.6. I am getting the following error. I am not using any case classes.
java.lang.UnsupportedOperationException: No Encoder found for scala.collection.immutable.Set[String]
field (class:…

user238607
- 1,580
- 3
- 13
- 18
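
Two common workarounds, sketched below, since Spark 2.x has no built-in encoder for scala.collection.immutable.Set:

import org.apache.spark.sql.{Encoder, Encoders}

// Option 1: fall back to kryo. This compiles and runs, but the
// column is stored as opaque binary rather than a queryable array.
implicit val setEncoder: Encoder[Set[String]] = Encoders.kryo[Set[String]]

// Option 2: model the field as Seq[String], which Spark encodes
// natively as an array column, and call .toSet after collecting.
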
2
votes
1 answer
How to pass Encoder as parameter to dataframe's as method
I want to convert a DataFrame to a Dataset using different case classes.
Now, my code is like below.
case class Views(views: Double)
case class Clicks(clicks: Double)
def convertViewsDFtoDS(df: DataFrame){
  df.as[Views]
}
def…

Lynn
- 21
- 1
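
The duplicated converters can collapse into one generic method whose Encoder arrives as a context bound; a sketch, reusing Views and Clicks from the question:

import org.apache.spark.sql.{DataFrame, Dataset, Encoder}

// df.as[T] needs an implicit Encoder[T]; the context bound forwards
// the one the caller has in scope from spark.implicits._.
def convertDFtoDS[T: Encoder](df: DataFrame): Dataset[T] = df.as[T]

// usage:
// import spark.implicits._
// val views:  Dataset[Views]  = convertDFtoDS[Views](df)
// val clicks: Dataset[Clicks] = convertDFtoDS[Clicks](df)
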
1
vote
1 answer
org.apache.spark.SparkRuntimeException: Only expression encoders are supported for now
I'm working with generics and encoders with Spark Datasets, and I'm facing the above error with code that looks like the following. Please ignore the semantics of the code; I'm just posting a simplified reproduction of the use case.
The Spark version I'm using is 3.2.1.
And Scala…

Hemanth Gowda
- 604
- 4
- 16
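
A sketch of the usual shape of the fix, on the assumption that the error came from a hand-built or kryo Encoder reaching an API that expects Spark's ExpressionEncoder: let the encoder flow in from the call site instead of constructing it inside the generic code.

import org.apache.spark.sql.{Dataset, Encoder}

// The context bound defers encoder resolution to the caller, where
// spark.implicits._ supplies a proper ExpressionEncoder; building a
// kryo encoder inside generic code is a typical way to hit
// "Only expression encoders are supported for now" on Spark 3.2.x.
def transformed[T: Encoder](ds: Dataset[T])(f: T => T): Dataset[T] =
  ds.map(f)
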
1
vote
1 answer
Failing to convert a dataframe to a dataset of objects with a custom enumeration field
I am facing an issue when trying to convert a dataframe to a dataset of objects with a custom field.
In this code, I have a dataframe with two columns, country and currency. I want to convert this into a dataset using the MyObj case class where the…

Bilal Ennouali
- 319
- 2
- 5
- 15
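
Spark has no encoder for scala.Enumeration values, so a common workaround is to keep the raw string in the case class and expose the enumeration through a method. A sketch; the Currency values are hypothetical:

object Currency extends Enumeration {
  val EUR, USD = Value
}

// The column stays a plain String, so df.as[MyObj] works with the
// built-in encoders; the enum is derived lazily on access.
case class MyObj(country: String, currency: String) {
  def currencyValue: Currency.Value = Currency.withName(currency)
}
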
1
vote
1 answer
Union generic type without Either Scala
This works fine:
def echo[A, B](a: A, b: B): A = ???
This is also fine:
def echo[A, B](a: A, b: B): B = ???
However, how do we write this so that it can return either type A or B?
// error
def echo[A, B](a: A, b: B): A|B = ???
Is it simply possible to…

jack
- 1,787
- 14
- 30
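
In Scala 2 there are no union types, so Either (or a sealed trait) is the standard encoding; Scala 3 accepts the question's signature as written. A sketch, with a pickA flag added only to make the example total:

// Scala 2: Either is the canonical "A or B" return type.
def echo[A, B](a: A, b: B, pickA: Boolean): Either[A, B] =
  if (pickA) Left(a) else Right(b)

// Scala 3 only:
// def echo[A, B](a: A, b: B, pickA: Boolean): A | B =
//   if pickA then a else b
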
1
vote
1 answer
How to convert a Dataframe to a Dataset, having an object reference of the parent class as a composition inside another class?
I am trying to convert a Dataframe to a Dataset, and the java classes structure is as follows:
class A:
public class A {
    private int a;

    public int getA() {
        return a;
    }

    public void setA(int a) {
        this.a = a;
…

Chirag
- 211
- 4
- 16
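
For orientation, the Scala analogue of the question's structure (names hypothetical): the composed object becomes a nested struct column, and a single product encoder covers the whole shape. The Java-bean version behaves the same, provided both classes follow bean conventions (public no-arg constructor plus getters/setters).

case class A(a: Int)
case class B(b: Int, parent: A)   // composition: B holds a reference to A

// import spark.implicits._
// val ds = df.as[B]   // expects columns (b: int, parent: struct<a: int>)
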
1
vote
2 answers
Add ADT column in Spark dataset?
I want to create a dataset which contains an ADT column. Based on this question: Encode an ADT / sealed trait hierarchy into Spark DataSet column
I know that there's a solution which encodes with kryo, but that is not really helpful.
There's…

sanyi14ka
- 809
- 9
- 14
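
One kryo-free pattern, sketched with a hypothetical Shape hierarchy: flatten the ADT into a single case class carrying a discriminator plus one optional field set per variant, which Spark encodes as an ordinary struct, and convert at the boundary.

sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Square(side: Double) extends Shape

// Encodable representation: a tag plus an Option per variant.
case class ShapeRow(kind: String, radius: Option[Double], side: Option[Double]) {
  def toShape: Shape = kind match {
    case "circle" => Circle(radius.get)
    case "square" => Square(side.get)
  }
}

object ShapeRow {
  def from(s: Shape): ShapeRow = s match {
    case Circle(r) => ShapeRow("circle", Some(r), None)
    case Square(x) => ShapeRow("square", None, Some(x))
  }
}
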
1
vote
0 answers
Encoders for collections in Apache Spark
Is there a way to create Encoders for collection types in Apache Spark? I tried the approach below, but it does not work.
import java.io.Serializable;
public class CollectionEntity<T> implements Serializable {
    private T collectionData;
public…

wandermonk
- 6,856
- 6
- 43
- 93
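
A Scala sketch of why this is hard, and the usual fallback: no encoder can be derived for a wrapper whose type parameter is unknown, so each use either pins T to a concrete type or drops to kryo at the cost of a binary column.

import org.apache.spark.sql.{Encoder, Encoders}

case class CollectionEntity[T](collectionData: T)

// Erasure-based fallback: valid for any T, but the whole object is
// stored as opaque binary, not as a queryable struct.
implicit def entityEncoder[T]: Encoder[CollectionEntity[T]] =
  Encoders.kryo[CollectionEntity[T]]
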
1
vote
2 answers
How is using encoders so much faster than Java serialization?
How is using encoders so much faster than Java and Kryo serialization?

Hemanth Gowda
- 604
- 4
- 16
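
The heart of the answer in one comparison: expression encoders generate code that writes fields directly into Tungsten's binary row format under a real schema, while Java/Kryo encoders blob the whole object into a single binary column that must be fully deserialized for every operation. A quick sketch that makes the difference visible (the printed schemas are approximate):

import org.apache.spark.sql.Encoders

case class User(id: Long, name: String)

// Expression encoder: a columnar schema Spark can prune, filter,
// and sort on without deserializing objects.
Encoders.product[User].schema.printTreeString()
// root
//  |-- id: long (nullable = false)
//  |-- name: string (nullable = true)

// Kryo encoder: one opaque binary column.
Encoders.kryo[User].schema.printTreeString()
// root
//  |-- value: binary (nullable = true)
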
0
votes
2 answers
Using Java Spark encoder bean to create a typed subset of a Dataset
I read a parquet file and I get a Dataset<Row> containing 57 columns.
Dataset<Row> ds = spark.read().parquet(locations);
I would like to use a custom type instead of Row. I have defined a Java bean such as
import lombok.Getter;
import…

Calimero
- 161
- 1
- 8
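
One common approach, sketched from Scala against the same API (the bean and column names are hypothetical): select exactly the bean's columns first, then apply the bean encoder, so the ~55 leftover columns never have to line up with the bean's schema.

import org.apache.spark.sql.{Dataset, Encoders, Row}

// Stand-in for the Lombok bean: bean encoding needs a public no-arg
// constructor plus getters and setters.
class Coordinates {
  private var lat: Double = _
  private var lon: Double = _
  def getLat: Double = lat
  def setLat(v: Double): Unit = lat = v
  def getLon: Double = lon
  def setLon(v: Double): Unit = lon = v
}

def typedSubset(ds: Dataset[Row]): Dataset[Coordinates] =
  ds.select("lat", "lon").as(Encoders.bean(classOf[Coordinates]))
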
0
votes
0 answers
Do Spark encoders respect Java's rules of inheritance?
My understanding: If I have a model class that extends a second model class, I shouldn't be able to access the private members of the parent class in the child class (unless I use reflection).
Extending this, I expect that when a Spark dataframe is…

Polyphonic Mobius
- 16
- 3
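
A sketch to probe the question, assuming bean encoders and hypothetical classes: Encoders.bean discovers state through public getters and setters via bean introspection, so it sees exactly what ordinary Java visibility rules expose, inherited accessors to a parent's private fields included.

import org.apache.spark.sql.Encoders

class Parent {
  private var secret: String = _            // private to Parent
  def getSecret: String = secret            // but exposed via a getter
  def setSecret(v: String): Unit = secret = v
}

class Child extends Parent {
  private var extra: Int = _
  def getExtra: Int = extra
  def setExtra(v: Int): Unit = extra = v
}

// Both properties appear in the schema, because introspection walks
// the public getters (inherited ones included), never the private fields.
Encoders.bean(classOf[Child]).schema.printTreeString()
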
0
votes
1 answer
Spark SQL encoder for immutable data types
I've generally used immutable value types when writing Java code. Sometimes it's been through libraries (Immutables, AutoValue, Lombok), but mostly just vanilla Java classes with:
- all final fields
- a constructor with all fields as parameters
(This…

drobert
- 1,230
- 8
- 21
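
For contrast, Scala gets this shape for free: a case class is exactly the all-final-fields, all-args-constructor pattern, and the product encoder handles it with no setters. For Java, bean encoders do expect setters and a no-arg constructor, which is the friction the question describes.

// Immutable value type: final fields, all-args constructor, no setters.
case class Point(x: Double, y: Double)

// import spark.implicits._
// val ds = Seq(Point(1.0, 2.0)).toDS()   // encoder derived automatically
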
0
votes
1 answer
Spark Dataframe - Encoder
I am new to Scala and Spark.
I am trying to use an encoder to read a file from Spark and then convert it to a Java/Scala object.
The first step, reading the file while applying a schema and encoding using as, works fine.
Then I use that dataset/dataframe to do a…

Sanjeev
- 119
- 4
- 18
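
A sketch of the flow the question describes, with a hypothetical schema, input path, and target class: read with an explicit schema, type the rows with as, then keep working on the typed Dataset.

import org.apache.spark.sql.SparkSession

case class Person(name: String, age: Long)   // hypothetical target type

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val people = spark.read
  .schema("name STRING, age LONG")   // step 1: apply a schema
  .json("people.json")               // hypothetical input file
  .as[Person]                        // step 2: encode rows as Person

// step 3: downstream operations stay typed
val adults = people.filter(_.age >= 18)
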