
I am trying the following simple transformation.

from pyspark.sql import functions as F

data = [["06/15/2020 14:04:04"]]
cols = ["date"]

df = spark.createDataFrame(data, cols)

df = df.withColumn("datetime", F.to_timestamp(F.col("date"), 'MM/DD/YYYY HH24:MI:SS'))
df.show()

But this gives me the error: "All week-based patterns are unsupported since Spark 3.0, detected: Y, Please use the SQL function EXTRACT instead".

I want to parse the string in that format and convert it to a timestamp.

  • What is `HH24:MI:` for? – Lamanus Feb 09 '23 at 08:13
  • I want the date to be in 24 hours format. Let me know the correct way if this is wrong way of defining. – Amaravathi Satya Feb 09 '23 at 08:17
  • Does this answer your question? [Better way to convert a string field into timestamp in Spark](https://stackoverflow.com/questions/29844144/better-way-to-convert-a-string-field-into-timestamp-in-spark) – Lamanus Feb 09 '23 at 08:59

2 Answers


You should use this format: MM/dd/yyyy HH:mm:ss

Check the datetime pattern page in Spark's documentation for all format-related details.

from pyspark.sql.functions import to_timestamp, col

df = df.withColumn("datetime", to_timestamp(col("date"), 'MM/dd/yyyy HH:mm:ss'))
df.show()

+-------------------+-------------------+
|               date|           datetime|
+-------------------+-------------------+
|06/15/2020 14:04:04|2020-06-15 14:04:04|
+-------------------+-------------------+
Mohana B C

The different elements of the timestamp pattern are explained in Spark's documentation. Note that since Spark 3.0, timestamps are parsed with patterns modeled on Java's DateTimeFormatter (previously SimpleDateFormat), which uses a somewhat confusing set of format symbols. The symbol for the hour in 24-hour representation is simply H (or HH when zero-padded), with no numeric suffix such as HH24. Minutes are m, not M, which stands for the month. The year is matched by y, not Y, which stands for the week year. Week-based patterns are unsupported since Spark 3.0, hence the message you're getting.

In your case, the proper format should be MM/dd/yyyy HH:mm:ss.

Hristo Iliev