I'm having trouble manipulating a WrappedArray column: I want to remove (filter out) elements from a WrappedArray column in a Spark dataset.
The WrappedArray contains objects. For example, I have a dataset containing the following column:
ColA
-----
WrappedArray([id:111, type:A],[id:222,type:B])
WrappedArray([id:333, type:A],[id:444,type:C])
WrappedArray([id:555, type:B],[id:666,type:C])
I want to remove any element inside the WrappedArray with type == A. The desired output looks like this:
ColA
-----
WrappedArray([id:222,type:B])
WrappedArray([id:444,type:C])
WrappedArray([id:555, type:B],[id:666,type:C])
I was thinking about using a UDF with withColumn, and I can see that the WrappedArray API has a filter function, but I can't get the syntax right.
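To make the goal concrete, here is a minimal plain-Java sketch of the per-row logic I'm after, with no Spark involved. The Item class, the dropType helper, and the class name are hypothetical and exist only for illustration; they are not part of any Spark API, and the real column elements may be rows or structs rather than a custom class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class WrappedArrayFilterSketch {

    // Hypothetical stand-in for one element of ColA (an id plus a type).
    static final class Item {
        final int id;
        final String type;

        Item(int id, String type) {
            this.id = id;
            this.type = type;
        }

        @Override
        public String toString() {
            return "[id:" + id + ", type:" + type + "]";
        }
    }

    // Returns a new list that omits every element whose type equals unwantedType.
    static List<Item> dropType(List<Item> items, String unwantedType) {
        return items.stream()
                .filter(item -> !unwantedType.equals(item.type))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The three sample rows of ColA from the question.
        List<List<Item>> colA = Arrays.asList(
                Arrays.asList(new Item(111, "A"), new Item(222, "B")),
                Arrays.asList(new Item(333, "A"), new Item(444, "C")),
                Arrays.asList(new Item(555, "B"), new Item(666, "C")));

        // Apply the filter row by row and print what remains.
        for (List<Item> row : colA) {
            System.out.println(dropType(row, "A"));
        }
    }
}
```

What I can't work out is how to apply the same predicate to the column itself. If the Spark version is 2.4 or later, the SQL higher-order function filter may be able to do this without a UDF, for example functions.expr("filter(ColA, x -> x.type != 'A')") inside withColumn. That assumes the elements are structs with a type field, which is an assumption about the schema.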
I'm working in Java, but an answer in any language is fine. Any help or suggestions would be appreciated!