I'm using GraphFrames with Spark 2.0 and Scala.

I need to remove double quotes from the columns that are of string type (out of many columns). I'm trying to do so with a UDF, as follows:

import org.apache.spark.sql.functions.udf

val removeDoubleQuotes = udf( (x:Any) =>
    x match{
      case s:String => s.replace("\"","")
      case other => other
    }
  )

I get the following error, because Spark cannot derive a schema for a UDF that takes or returns type Any:

java.lang.UnsupportedOperationException: Schema for type Any is not supported

What is a workaround for that?

MehrdadAP
  • Do your columns have mixed types? Why not just write it only for strings and apply it only to the string columns? – Joe K Jul 08 '17 at 01:08
  • @JoeK Because I have many columns, and I'm looking for a way to avoid finding the string columns manually. – MehrdadAP Jul 09 '17 at 17:51

1 Answer

I don't think you have a column of type Any, and a udf can't return different data types depending on its input. It has to return a single data type.

If your column is a String, you can create the udf as

import org.apache.spark.sql.functions.udf

val removeDoubleQuotes = udf( (x:String) => x.replace("\"",""))
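
For the many-columns case raised in the comments, one possible approach is to read the string columns off the DataFrame schema instead of listing them by hand. The following is only a sketch: the helper name stripQuotesFromStringColumns is hypothetical, and it assumes the column names contain no dots or other characters that would need escaping in col(...).

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, udf}
import org.apache.spark.sql.types.StringType

// Typed UDF: String in, String out, so Spark can derive the schema.
// Null values are passed through unchanged.
val removeDoubleQuotes =
  udf((s: String) => if (s == null) null else s.replace("\"", ""))

// Hypothetical helper: applies the UDF to every StringType column of `df`
// and leaves all other columns untouched.
def stripQuotesFromStringColumns(df: DataFrame): DataFrame =
  df.schema.fields.foldLeft(df) { (acc, field) =>
    if (field.dataType == StringType)
      acc.withColumn(field.name, removeDoubleQuotes(col(field.name)))
    else
      acc
  }
```

Since the vertices and edges of a GraphFrame are ordinary DataFrames, the same helper could presumably be applied to each of them, with the graph then rebuilt from the cleaned DataFrames.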
koiralo