In pandas, I can run the following code:
contract['PREV_END'] = contract.groupby('SUBSCR_NO').END.shift(1)
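For reference, the equivalent call through the pandas API on Spark would be something like this (a sketch, assuming contract starts as a plain pandas DataFrame and is converted with ps.from_pandas; column names follow the snippet above):

import pyspark.pandas as ps

# Convert the pandas DataFrame to a pandas-on-Spark DataFrame,
# then apply the same groupby/shift pattern as above.
psdf = ps.from_pandas(contract)
psdf['PREV_END'] = psdf.groupby('SUBSCR_NO').END.shift(1)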
But running it that way, I get this error:
AnalysisException: cannot resolve 'isnan(lag(CON_END, 1, NULL) OVER (PARTITION BY SUBSCR_NO ORDER BY natural_order ASC NULLS FIRST ROWS BETWEEN -1 FOLLOWING AND -1 FOLLOWING))' due to data type mismatch: argument 1 requires (double or float) type, however, 'lag(CON_END, 1, NULL) OVER (PARTITION BY SUBSCR_NO ORDER BY natural_order ASC NULLS FIRST ROWS BETWEEN -1 FOLLOWING AND -1 FOLLOWING)' is of date type.;
From the error, the lag itself appears to resolve fine; it is the isnan(...) wrapper that requires a double or float but receives a date. I checked the documentation for GroupBy.shift: https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.groupby.GroupBy.shift.html?highlight=shift#pyspark.pandas.groupby.GroupBy.shift
It doesn't state that date columns are unsupported, but it also doesn't say which types are supported.
What can I do to achieve this shift with the pandas API on Spark?
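For what it's worth, since the lag itself seems fine, dropping down to the underlying Spark DataFrame and using lag directly should sidestep the isnan check. A sketch, where ORDER_COL is a hypothetical column that defines the row order (Spark needs an explicit sort, while pandas relies on row order):

from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Go back to a plain Spark DataFrame and compute the lag
# over a window partitioned per subscriber.
sdf = psdf.to_spark()
w = Window.partitionBy('SUBSCR_NO').orderBy('ORDER_COL')
sdf = sdf.withColumn('PREV_END', F.lag('END', 1).over(w))

But I'd prefer to stay within the pandas API on Spark if possible.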