Can anyone explain why I am getting different results for these two expressions? I am trying to filter rows between two dates:
df.filter("act_date <='2017-04-01'" and "act_date >='2016-10-01'")\
.select("col1","col2").distinct().count()
Result: 37M
vs
df.filter("act_date <='2017-04-01'").filter("act_date >='2016-10-01'")\
.select("col1","col2").distinct().count()
Result: 25M
How are they different? It seems to me that they should produce the same result.
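
For reference, here is a minimal check in plain Python (no Spark needed) of what the first call actually receives. It suggests that Python evaluates the `and` between the two strings before `filter` ever sees them, so only one of the two conditions is passed along. The names `cond` and `combined` are just illustrative, and the combined string at the end is a sketch that assumes `act_date` compares correctly against these literals:

```python
# Python's `and` on two non-empty (truthy) strings returns the right-hand
# operand; it does not build a combined condition.
cond = "act_date <='2017-04-01'" and "act_date >='2016-10-01'"
print(cond)  # prints: act_date >='2016-10-01'

# A possible way to express both bounds in a single SQL expression string,
# with the AND inside the string so that Spark parses it.
combined = "act_date <= '2017-04-01' AND act_date >= '2016-10-01'"
print(combined)
```

If that is what is happening, the first version only applies the lower bound, which would be consistent with it returning more rows (37M) than the chained version (25M). Passing `combined` to `df.filter(...)` should then presumably match the chained result.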