The code col('col1') returns the pyspark.sql.Column in your DataFrame with the name "col1".
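You are getting the error:
TypeError: 'Column' object is not callable
because you are trying to call split (and trim) as methods on this column, but no such methods exist. Attribute access on a Column, such as col('col1').split, is treated as a nested-field lookup and returns another Column, and calling that Column is what raises the error.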
Instead, call the functions pyspark.sql.functions.split() and pyspark.sql.functions.trim() with the Column passed in as an argument.
For instance:
import pyspark.sql.functions as f

# Split col1 on a literal "+", take the second piece, and strip the whitespace around it
df1 = df.withColumn(
    "newcol",
    f.trim(
        f.split(f.col('col1'), r"\+")[1]
    )
)
df1.show(truncate=False)
#+-----------------------------------------------+----------------------+
#|col1                                           |newcol                |
#+-----------------------------------------------+----------------------+
#|10/35/70/25% T4Max-300 + 20/45/80/25% T4Max-400|20/45/80/25% T4Max-400|
#+-----------------------------------------------+----------------------+
The second argument to split() is treated as a regular expression pattern, so the + has to be escaped.
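The escaping rule can be tried out without a Spark session. The sketch below uses Python's built-in re module as a stand-in. Spark evaluates the pattern with Java's regex engine rather than Python's, so this is only an analogy, but both engines treat a bare + as a quantifier and \+ as a literal plus sign. The variable names are made up for illustration.

```python
import re

s = "10/35/70/25% T4Max-300 + 20/45/80/25% T4Max-400"

# Escaped "+": matches a literal plus sign, so the string splits into two parts.
parts = re.split(r"\+", s)
second = parts[1].strip()  # same idea as f.trim(f.split(...)[1])

# Unescaped "+": a quantifier with nothing before it to repeat,
# so the pattern itself is invalid and re raises an error.
try:
    re.split("+", s)
    unescaped_ok = True
except re.error:
    unescaped_ok = False

print(second)
print(unescaped_ok)
```

If you would rather not write the backslash, a character class such as "[+]" should also match a literal plus sign in both engines.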