I want to negate every value in an array column of my PySpark DataFrame without exploding it. I tried this UDF, but it didn't work:
from pyspark.sql import functions as func, types as T
negative = func.udf(lambda x: x * -1, T.ArrayType(T.FloatType()))
cast_contracts = cast_contracts \
    .withColumn('forecast_values', negative('forecast_values'))
Can someone help?
Example data frame:
from pyspark.sql import Row
df = sc.parallelize(
    [Row(name='Joe', forecast_values=[1.0, 2.0, 3.0]),
     Row(name='Mary', forecast_values=[4.0, 7.1])]).toDF()
>>> df.show()
+----+---------------+
|name|forecast_values|
+----+---------------+
| Joe|[1.0, 2.0, 3.0]|
|Mary|     [4.0, 7.1]|
+----+---------------+
Thanks
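
A likely reason the UDF above misbehaves: the array column reaches the lambda as a Python list, and multiplying a list by an integer repeats the list instead of multiplying its elements, so `x * -1` yields an empty list. The sketch below shows this in plain Python, with no Spark session needed. `negate_all` is a hypothetical helper name, and passing `None` through for null arrays is an assumption about the desired behavior.

```python
# Plain-Python sketch of what happens inside the UDF for one row.
values = [1.0, 2.0, 3.0]

# What the original lambda computes: list * int is sequence repetition,
# and repeating a list a negative number of times gives an empty list.
repeated = values * -1

def negate_all(xs):
    """Hypothetical helper: negate each element of a list.

    Returns None for a null array (an assumption about the desired behavior).
    """
    if xs is None:
        return None
    return [-v for v in xs]

negated = negate_all(values)

print(repeated)  # []
print(negated)   # [-1.0, -2.0, -3.0]
```

Passing `negate_all` to `func.udf` in place of the lambda, with the same `T.ArrayType(T.FloatType())` return type, should then negate each element. Assuming Spark 2.4 or later, the built-in higher-order function may avoid the Python UDF altogether: `func.expr("transform(forecast_values, x -> -x)")`.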