I am using a data type called Point(x: Double, y: Double). I am trying to use columns _c1 and _c2 as input to Point() and create a new column of Point values, as follows:
val toPoint = udf{(x: Double, y: Double) => Point(x,y)}
Then I call the function:
val point = data.withColumn("Point", toPoint(wanted("c1"), wanted("c2")))
However, when I declare the udf I get the following error:
java.lang.UnsupportedOperationException: Schema for type com.vividsolutions.jts.geom.Point is not supported
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:733)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$2.apply(ScalaReflection.scala:729)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$2.apply(ScalaReflection.scala:728)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:728)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:671)
at org.apache.spark.sql.functions$.udf(functions.scala:3084)
... 48 elided
I have imported this data type properly and have used it many times before. Now that I try to include it in the schema of my udf, Spark doesn't recognize it. How can I use types other than the standard Int, String, Array, etc. in a udf?
I am using Spark 2.1.0 on Amazon EMR.
Here are some related questions I've referenced: