Return a pandas Series from a pandas_udf in Spark

Question

On Apache Spark, I have a pandas_udf function that should return a pd.Series. How can this be achieved?

I tried:

@pandas_udf(ArrayType(LongType()), PandasUDFType.SCALAR_ITER) # Only works with spark 3.0
def udf(iterator):
  ...
  return pd.Series([1,2,3,4,5])

This gives the exception:

pyarrow.lib.ArrowNotImplementedError: NumPyConverter doesn't implement <list<item: int64>> conversion.

Can you share what you want to achieve with some example data ? — Psidom, Feb 29 '20 at 19:42

score -2 · Answer 1 · answered Mar 11 '20 at 19:53

OK, this was an error on my side: the schema type declared for the pandas UDF was wrong.
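A minimal sketch of what a matching schema type could look like, assuming the mismatch was between the declared return type and the shape of the returned values (the answer does not spell this out). With `ArrayType(LongType())`, each element of the returned Series is expected to be a list of longs, while a flat Series of integers fits `LongType()`. A `SCALAR_ITER` function also yields one Series per input batch, of the same length as that batch, instead of using `return`. The names `add_one`, `to_pair`, and `as_spark_udfs` are hypothetical and only for illustration.

```python
from typing import Iterator

import pandas as pd


def add_one(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # One long per input row: pair this with a LongType() return type.
    for batch in batches:
        yield batch + 1


def to_pair(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # One list of longs per input row: pair this with ArrayType(LongType()).
    for batch in batches:
        yield batch.apply(lambda v: [int(v), int(v) + 1])


def as_spark_udfs():
    # Imported lazily so the plain pandas functions above can be tried without Spark.
    from pyspark.sql.functions import pandas_udf, PandasUDFType
    from pyspark.sql.types import ArrayType, LongType

    # The declared return type describes a single element of the yielded Series.
    add_one_udf = pandas_udf(add_one, LongType(), PandasUDFType.SCALAR_ITER)
    to_pair_udf = pandas_udf(to_pair, ArrayType(LongType()), PandasUDFType.SCALAR_ITER)
    return add_one_udf, to_pair_udf
```

On a DataFrame with a long column `x`, the result of `as_spark_udfs()` could then be used as `df.select(add_one_udf("x"), to_pair_udf("x"))`. Separating the pandas logic from the Spark registration is a design choice made here so that the batch functions can be checked locally on plain pandas Series.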

Jorge Machado
