Let's say I have a column of strings like this:

Hour
0045
2322

And I want it to become like this:

Hour
00:45
23:22

I want to do this so that I can then convert the column to a timestamp. How would I go about it?

BryceSoker

1 Answer

You can use regexp_replace to capture the two pairs of digits and insert a colon between them:

from pyspark.sql.functions import col, regexp_replace

df.withColumn("Hour", regexp_replace(col("Hour"), "(\\d{2})(\\d{2})", "$1:$2")).show()

+-----+
| hour|
+-----+
|00:45|
|00:50|
+-----+
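
If it helps to sanity-check the pattern before running it on a cluster, here is a minimal plain-Python sketch of the same two-group substitution. It is not Spark code: the sample list and the `^`/`$` anchors are assumptions added for illustration, and Python's re module writes backreferences as \1 and \2 where Spark's Java regex uses $1 and $2.

```python
import re
from datetime import datetime

# Hypothetical sample values, taken from the question's column.
hours = ["0045", "2322"]

# Same two capture groups as the Spark pattern. The ^ and $ anchors are an
# added assumption so that only four-digit strings are rewritten.
pattern = re.compile(r"^(\d{2})(\d{2})$")

# Insert a colon between the two captured digit pairs.
formatted = [pattern.sub(r"\1:\2", h) for h in hours]
print(formatted)

# Parsing each result as a time of day checks that it is a valid HH:MM value.
parsed = [datetime.strptime(f, "%H:%M").time() for f in formatted]
print(parsed)
```

Strings that do not match the anchored pattern, such as a three-digit value, are left unchanged, so they would need separate handling before the conversion to a timestamp.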
philantrovert