I have a pyspark.sql DataFrame that was created by an inner join of two DataFrames. After the join, I added a column that gives the week start date for each row's date.

from pyspark.sql.functions import date_sub, next_day
Joined_data = Joined_data.withColumn("Week_start_date", date_sub(next_day('AsOfDate', 'Sun'), 7))
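To make clear what that column holds: next_day returns the first Sunday strictly after AsOfDate, so subtracting 7 days gives the Sunday on or before AsOfDate. Below is a minimal plain-Python sketch of the same arithmetic, with no Spark involved. The helper name week_start and the sample dates are made up for illustration and are not part of my actual job.

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Mirror date_sub(next_day(d, 'Sun'), 7): the Sunday on or before d."""
    # Python's weekday(): Monday=0 ... Sunday=6.
    # next_day is strictly after d, so a Sunday maps to the following Sunday (+7).
    days_until_next_sunday = (6 - d.weekday()) % 7 or 7
    next_sunday = d + timedelta(days=days_until_next_sunday)
    return next_sunday - timedelta(days=7)


# Hypothetical sample dates standing in for the AsOfDate column.
as_of_dates = [date(2024, 1, 6), date(2024, 1, 7), date(2024, 1, 10)]

# Distinct week starts, which is what select(...).dropDuplicates() should yield.
week_starts = sorted({week_start(d) for d in as_of_dates})
print(week_starts)
```

The set comprehension plays the role of dropDuplicates() here: the result I am after is simply the distinct Week_start_date values.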

Now I want to create a list (collection) of all the distinct weeks. I am using the code below.

DateList = Joined_data.select('Week_start_date').dropDuplicates()

I get the following error: "Using PythonUDF in join condition of join type LeftSemi is not supported."

If I remove the dropDuplicates() call from the line above, it runs without any error.

Does anyone know why I get this error with the dropDuplicates() method?

ASD
