
I am working with a Spark DataFrame that has 7 columns, one of which is a date column. In my case the dates are represented as quarters, e.g. 2017Q1, 2017Q2, 2017Q3.

How do I create 2 new columns as below?

Date_String
Replace 2017Q1 with the string "January - March 2017" and
replace 2017Q2 with the string "April - June 2017"

I tried the command below:

df= df.withColumn("Date_String",where(col("quarter_date") == "2017Q4"),lit('October - December 2017'))

Can anyone please help me with this?

  • Try replacing the `where` with `when` (assuming you already have imported `pyspark.sql.functions.when`) – pault Aug 21 '18 at 13:35
  • Thanks @pault , Below is the solution to achieve it – EricA Aug 21 '18 at 13:55
  • `df.withColumn('Date_String', when(df.quarter_date == "2017Q4", 'October - December 2017').when(df.quarter_date == "2017Q2", 'April - June 2017').when(df.quarter_date == "2017Q1", 'January - March 2017').when(df.quarter_date == "2017Q3", 'July - September 2017').otherwise('other')).show()` – EricA Aug 21 '18 at 13:56
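
The `when` chain in the comments lists each quarter of 2017 by hand. If other years can appear in `quarter_date`, the label could instead be derived from the value itself. Below is a minimal sketch of that idea, not a tested drop-in solution. It assumes calendar (not fiscal) quarters and values shaped like `2017Q1`. The names `QUARTER_MONTHS`, `quarter_label`, and `add_date_string` are hypothetical and not part of the original post.

```python
import re

# Month ranges for each calendar quarter (assumption: calendar, not fiscal, quarters).
QUARTER_MONTHS = {
    "1": "January - March",
    "2": "April - June",
    "3": "July - September",
    "4": "October - December",
}

# Four-digit year, the letter Q, then a quarter number from 1 to 4.
_QUARTER_RE = re.compile(r"(\d{4})Q([1-4])")


def quarter_label(value):
    """Turn '2017Q1' into 'January - March 2017'; return 'other' if unparseable."""
    match = _QUARTER_RE.fullmatch(value.strip()) if isinstance(value, str) else None
    if match is None:
        return "other"
    year, quarter = match.groups()
    return f"{QUARTER_MONTHS[quarter]} {year}"


def add_date_string(df, src="quarter_date", dst="Date_String"):
    """Hypothetical wrapper: apply quarter_label to a Spark column via a UDF."""
    # Imported lazily so quarter_label can be used and tested without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    return df.withColumn(dst, udf(quarter_label, StringType())(df[src]))
```

With a DataFrame like the one in the question, `add_date_string(df).show()` would be the equivalent call. A Python UDF is generally slower than built-in column functions, so for a small, fixed set of quarters the `when` chain above may still be the better choice.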
