I have a Spark DataFrame that I've converted to a pandas-on-Spark DataFrame using `df.to_pandas_on_spark()`.
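
For context, here's roughly the setup (a minimal sketch; the `spark` session and the `Income`/`Fees` data are stand-ins for my real ones):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Stand-in for my real data
sdf = spark.createDataFrame([(100.456, 3.789), (57.0, 0.0)], ["Income", "Fees"])

# Convert the Spark DataFrame to a pandas-on-Spark DataFrame
df = sdf.to_pandas_on_spark()
```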
I have some logic that rounds a column, i.e.:

```python
df["Column_Name"] = round(df["Income"] - df["Fees"], 2)
```
but I get the following error:

```
TypeError: type Series doesn't define __round__ method
```
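
If I'm reading that right, the built-in `round()` just delegates to a `__round__` method on its argument, and the pandas-on-Spark `Series` apparently doesn't define one. A toy illustration of that delegation (hypothetical classes, just to show the mechanism):

```python
class WithRound:
    def __round__(self, ndigits=None):
        return "rounded!"

class WithoutRound:
    pass

print(round(WithRound(), 2))  # prints "rounded!" -- round() delegates to __round__

round(WithoutRound(), 2)  # raises: TypeError: type WithoutRound doesn't define __round__ method
```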
After searching around, it seems like historically (with Koalas) the fix was to use PySpark's SQL functions:

```python
import pyspark.sql.functions as f

df["Column_Name"] = f.round(df["Income"] - df["Fees"], 2)
```
Is there no equivalent `round` function in `pyspark.pandas`?
Because when I use `pyspark.sql.functions`' `round`, I get the following error:

```
Invalid argument, not a string or column: 57    0.000000
```
I'm not sure why it gives me a "not a string" error; do people round strings? I would have expected it to at least say "not a float/int/number", etc. (The `57    0.000000` bit looks like it's just the repr of my Series being echoed back in the message.)
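
For what it's worth, `f.round` does seem happy when it's given an actual Spark `Column` on the original (pre-conversion) Spark DataFrame, which I'm calling `sdf` here, so presumably the pandas-on-Spark `Series` is neither the string nor the `Column` the message is asking for:

```python
import pyspark.sql.functions as f

# Works: f.round receives a Spark Column expression,
# not a pandas-on-Spark Series
sdf2 = sdf.withColumn("Column_Name", f.round(sdf["Income"] - sdf["Fees"], 2))
sdf2.show()
```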
All of this logic works fine when I'm using plain pandas (not to be confused with `pyspark.pandas`).
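
That is, the equivalent plain-pandas version runs without complaint, presumably because `pandas.Series` does implement `__round__` (a minimal sketch with made-up numbers):

```python
import pandas as pd

pdf = pd.DataFrame({"Income": [100.456, 57.0], "Fees": [3.789, 0.0]})

# Plain pandas: the builtin round() works because pd.Series defines __round__
pdf["Column_Name"] = round(pdf["Income"] - pdf["Fees"], 2)
print(pdf)
```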