
I am converting a string to a datetime field using the joda.time.DateTime library, but it throws an unsupported operation exception. Here is the main class code:

// create new var with input data without header
var inputDataWithoutHeader: RDD[String] = dropHeader(inputFile)
var inputDF1 = inputDataWithoutHeader.map(_.split(",")).map { p =>
  val dateYMD: DateTime = DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss").parseDateTime(p(8))
  testData(dateYMD)
}.toDF().show()

p(8) is the column with the datetime datatype defined in the testData class, and the CSV data for this column has values like 2013-02-17 00:00:00.

Here is testData Class:

case class testData(StartDate: DateTime) { }

Here is the error I get:

Exception in thread "main"

java.lang.UnsupportedOperationException: Schema for type org.joda.time.DateTime is not supported
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:153)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:29)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:128)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:126)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:126)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:29)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:64)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:29)
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:361)
    at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:47)
    at com.projs.poc.spark.ml.ProcessCSV$delayedInit$body.apply(ProcessCSV.scala:37)
rk1113

3 Answers

  1. As you can read in the official documentation, dates in Spark SQL are represented using java.sql.Timestamp. If you want to use Joda-Time you have to convert the output to the correct type (a minimal sketch of the conversion follows the snippet below).

  2. Spark SQL can easily handle standard date formats using type casting:

    sc.parallelize(Seq(Tuple1("2016-01-11 00:01:02")))
      .toDF("dt")
      .select($"dt".cast("timestamp"))
    
zero323

Thanks zero323 for the solution. I used java.sql.Timestamp, and here is the code I modified:

val dateYMD: java.sql.Timestamp = new java.sql.Timestamp(DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss").parseDateTime(p(8)).getMillis)
testData(dateYMD)
}.toDF().show()

and changed my class to

case class testData(GamingDate: java.sql.Timestamp) { }
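
For completeness, a self-contained sketch of that pipeline (a sketch only; it assumes a Spark 1.x shell where sc and sqlContext.implicits._ are in scope, and the sample row just mirrors the question):

    import java.sql.Timestamp
    import org.joda.time.format.DateTimeFormat

    case class testData(GamingDate: Timestamp)

    val fmt = DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss")

    // stand-in for the parsed CSV rows from the question
    val df = sc.parallelize(Seq("2013-02-17 00:00:00"))
      .map(s => testData(new Timestamp(fmt.parseDateTime(s).getMillis)))
      .toDF()

    df.printSchema()   // GamingDate: timestamp (nullable = true)
    df.show()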
rk1113

The Scala Spark schema does not support Joda DateTime explicitly. You can explore other options (a small sketch of options 1 and 3 follows this list):

  1. Convert the datetime to milliseconds and keep it as a Long.

  2. Convert the datetime to Unix time (Java format): https://stackoverflow.com/a/44957376/9083843

  3. Convert the datetime to a string. You can convert it back to a Joda DateTime at any moment using DateTime.parse("stringdatetime").

  4. If you still want to keep Joda DateTime values alongside your Scala schema, you can convert your DataFrame to a sequence:

    dataframe.rdd.map(r => DateTime.parse(r(0).toString)).collect().toSeq
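
To make options 1 and 3 concrete, a small round-trip sketch (the value is illustrative only):

    import org.joda.time.DateTime

    val dt = DateTime.parse("2013-02-17T00:00:00")

    // Option 1: store epoch milliseconds as a Long
    val millis: Long = dt.getMillis
    val fromMillis: DateTime = new DateTime(millis)

    // Option 3: store an ISO-8601 string and parse it back when needed
    val asString: String = dt.toString
    val fromString: DateTime = DateTime.parse(asString)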

CTiPKA