
Does anybody know why this is happening?

[image not included]

and when I filter it:

[image not included]

EDIT: This is how I added the last two columns. I suspect the problem is tied to pandas_udf: I generated the last two columns with pandas_udf, and filtering on them misbehaves, whereas the first four columns, which I built with a plain udf, filter without any trouble.

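import pandas as pd
from pyspark.sql.functions import pandas_udf, PandasUDFType

@pandas_udf('string', PandasUDFType.SCALAR)
def blocking(ids, x, y):
    ....
    return pd.Series(final)

# Add the UDF result as a new column
df4 = df3.withColumn('blocking_index',
                     blocking(df3.id, df3.ratepayer, df3.CharityName))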
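
Since the body of `blocking` is elided, one way to narrow this down might be to run the same logic on plain pandas data outside Spark. A scalar pandas UDF receives each input column as a `pandas.Series` batch and is expected to return a `pandas.Series` of the same length. The sketch below checks that contract; the body of `blocking_logic`, the helper `check_scalar_contract`, and the sample values are all hypothetical stand-ins, not the code from the post.

```python
import pandas as pd


def blocking_logic(ids: pd.Series, x: pd.Series, y: pd.Series) -> pd.Series:
    # Hypothetical stand-in for the elided body: build a key from the
    # first letter of each of the two text columns.
    final = [f"{a[:1]}{b[:1]}".lower() for a, b in zip(x, y)]
    return pd.Series(final)


def check_scalar_contract(func, *batches):
    # Hypothetical helper: a scalar pandas UDF should return a Series
    # with exactly one output value per input row in the batch.
    out = func(*batches)
    assert isinstance(out, pd.Series), "UDF must return a pandas.Series"
    assert len(out) == len(batches[0]), "output length must match the batch"
    return out


# Made-up sample batch mirroring the three input columns.
ids = pd.Series([1, 2, 3])
ratepayer = pd.Series(["Acme Ltd", "Beta Trust", "Gamma Fund"])
charity = pd.Series(["Acme Charity", "Better Trust", "Gala Fund"])

result = check_scalar_contract(blocking_logic, ids, ratepayer, charity)
print(result.tolist())
```

If the real function passes a check like this on several small batches, the logic itself is less likely to be the culprit, and the same sample rows could serve as the reproducible example.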
  • [Why you shouldn't upload pictures of code/data](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-on-so-when-asking-a-question). – pault Jul 30 '18 at 13:43
  • Datatypes are fine – SalvorHardin Jul 31 '18 at 11:00
  • There's no way for us to help you unless we can recreate your problem. Please read [how to create good reproducible apache spark dataframe examples](https://stackoverflow.com/questions/48427185/how-to-make-good-reproducible-apache-spark-dataframe-examples) and try to provide a [mcve]. – pault Jul 31 '18 at 14:13
