
I have a column ORDER_DATE with an epoch timestamp stored as a string. How can I convert this column, with values like str = "1536309236032" (a time in epoch), to a string in the format 2018-09-07T14:03:56.032Z in Scala?

Currently I am using:

from_unixtime(input.col(ORDER_DATE), "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")

but this converts it incorrectly to 50668-08-21 01:10:00.000. Here, it inflates the year and always fills in 000 for the milliseconds.

I don't want to divide by 1000, as we would like to have the result in milliseconds.

  • Possible duplicate of https://stackoverflow.com/questions/33475229/convert-epoch-to-datetime-in-scala-spark – darkmatter Sep 18 '18 at 13:25
  • I have made my question clearer and it is a different requirement from the link you mentioned. Can you help me with this? – Abhinav Kaushal Keshari Sep 18 '18 at 14:36
  • From the length of your input time string, it's probably in milliseconds. Try: `from_unixtime($"epochstr".cast("long")/1000, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")` – Leo C Sep 18 '18 at 16:01
  • But I want the output in milliseconds as well. Is there a way to get that? – Abhinav Kaushal Keshari Sep 18 '18 at 17:18

2 Answers


In the documentation, the definition of from_unixtime is as follows:

Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.

It uses seconds and is thus not compatible with milliseconds, which is why the result is wrong. To convert the epoch timestamp while keeping the millisecond information, you can use concat:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._

df.withColumn("time", concat(
    // format the whole seconds (the milliseconds are dropped here)
    from_unixtime($"ORDER_DATE" / 1000, "yyyy-MM-dd'T'HH:mm:ss."),
    // append the last 3 characters of the epoch string, i.e. the milliseconds
    $"ORDER_DATE".substr(length($"ORDER_DATE") - 2, length($"ORDER_DATE")),
    lit("Z")))

This will work since the last 3 digits in the epoch timestamp are the same as those in the wanted result.
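For illustration, a minimal self-contained sketch of the above, assuming Spark 2.x and a hypothetical one-row DataFrame built from the sample value in the question:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._

// hypothetical one-row DataFrame with the sample value from the question
val df = Seq("1536309236032").toDF("ORDER_DATE")

df.withColumn("time", concat(
    from_unixtime($"ORDER_DATE" / 1000, "yyyy-MM-dd'T'HH:mm:ss."),
    $"ORDER_DATE".substr(length($"ORDER_DATE") - 2, length($"ORDER_DATE")),
    lit("Z")))
  .show(false)

// The last three characters (032) are copied verbatim from the string,
// while the date and hour come from from_unixtime and therefore follow
// the session time zone (the question's expected 2018-09-07T14:03:56.032Z
// corresponds to a UTC+05:30 session time zone).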

  • Can you also suggest a way to perform this action only when the field in the column is not "0", and otherwise leave the field empty? – Abhinav Kaushal Keshari Sep 19 '18 at 10:54
  • 1
    @AbhinavKaushalKeshari: You can use `when` and `otherwise`, see for example here: https://stackoverflow.com/questions/37064315/how-to-write-case-with-when-condition-in-spark-sql-using-scala – Shaido Sep 19 '18 at 11:00
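A rough sketch of that when/otherwise suggestion (my illustration, not part of the answer; it reuses the df, imports and concat expression from the snippet above):

df.withColumn("time",
  when($"ORDER_DATE" =!= "0", concat(
      from_unixtime($"ORDER_DATE" / 1000, "yyyy-MM-dd'T'HH:mm:ss."),
      $"ORDER_DATE".substr(length($"ORDER_DATE") - 2, length($"ORDER_DATE")),
      lit("Z")))
    .otherwise(lit("")))  // leave the field empty when ORDER_DATE is "0"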

I got the idea from @Shaido and I did something similar. Finally, this solved the issue for me:

import org.apache.spark.sql.functions._

input.withColumn("time",
  concat(from_unixtime(input.col("ORDER_DATE") / 1000, "yyyy-MM-dd'T'HH:mm:ss"),
    typedLit("."), substring(input.col("ORDER_DATE"), 11, 3), typedLit("Z")))
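One observation on this design (mine, not the answer author's): substring(..., 11, 3) assumes ORDER_DATE is always a 13-digit millisecond string. If the length can vary, deriving the slice from the column length, as in the first answer, is safer; a sketch:

import org.apache.spark.sql.functions._

input.withColumn("time",
  concat(from_unixtime(input.col("ORDER_DATE") / 1000, "yyyy-MM-dd'T'HH:mm:ss"),
    typedLit("."),
    // last three characters of the epoch string, whatever its length
    input.col("ORDER_DATE").substr(length(input.col("ORDER_DATE")) - 2, lit(3)),
    typedLit("Z")))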