
I need to convert the SQL below to PySpark DataFrame API code, not spark.sql('...') code.

select * from table1 where
(
  case when clm1 in ('R', 'C', 'F') then 1=1
  when clm1 in ('8', '8-B')
    and (select coalesce(max(code), 0) from table2 where clm2 = 'XXX') = 0 then 1=1
  else 1=2 end
)

How do I use a CASE condition inside a WHERE clause with the DataFrame API?

Shubham Jain
  • This shows some examples of using `case when` with pyspark: https://stackoverflow.com/questions/39982135/apache-spark-dealing-with-case-statements – igorkf Jul 06 '20 at 12:03
  • can you please share an input and output dataset – what do they look like? – dsk Jul 07 '20 at 15:09

1 Answer


Assuming the above is the working query, you can pass the same CASE expression to where() as a SQL string:

df.where("case when clm1 in ('R','C', 'F') then l=1 when clm1 in ('8','8-B') and (select coalesce(max (code),0) from table2 where clm2 = "XXX') = 0 then 1=2 else 1=2 end")
Som
  • What about if the SQL is as below: select * from table1 where ( case when clm1 in ('R', 'C', 'F') then 1=1 when clm1 in ('8', '8-B') and (select coalesce(max(code), 0) from table2 where clm2 = 'XXX') = 0 then table1.clm3 = table2.clm4 else 1=2 end) – Rishi Anand Jul 09 '20 at 11:13
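
A possible DataFrame translation of that variant, reusing table1, table2, and F from the sketch above. The correlated comparison table1.clm3 = table2.clm4 is ambiguous in the SQL; this sketch assumes it means "keep a table1 row whose clm3 matches some clm4 among the filtered table2 rows", implemented with a left semi join:

t2 = table2.where(F.col("clm2") == "XXX")
max_code = t2.agg(F.coalesce(F.max("code"), F.lit(0)).alias("mc")).first()["mc"]

# Branch 1 of the CASE: clm1 in ('R','C','F') keeps the row outright.
branch1 = table1.where(F.col("clm1").isin("R", "C", "F"))

# Branch 2: clm1 in ('8','8-B') and the aggregate is 0; a row survives
# only if its clm3 matches some clm4 in the filtered table2.
branch2 = table1.where(F.col("clm1").isin("8", "8-B"))
if max_code == 0:
    t2_keys = t2.select("clm4").distinct()
    branch2 = branch2.join(t2_keys, branch2["clm3"] == t2_keys["clm4"], "left_semi")
else:
    branch2 = branch2.where(F.lit(False))  # CASE falls through to ELSE 1=2

result = branch1.unionByName(branch2)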