5

Is there a built-in function to add a new column which is the negation of the original column?

Spark SQL has the function negative(). PySpark does not seem to have inherited this function.

df_new = df.withColumn(negative("original"))
fermi
  • I think you are asking: `df_new = df.withColumn("new_column_name", [what to put in here?])` – Tim C. May 24 '23 at 23:08

2 Answers

9

Assuming your column original is Boolean:

df_new = df.withColumn(~df["original"])  # Equivalent to "not original"
Pierre Gourseaud
  • Thanks Pierre, it looks like the '~' operator only works on Boolean types. This operator is handy though. – fermi Aug 21 '19 at 19:44
  • Is there something PySpark-native to do this? I think if I use `~` then it is going to be executed in a Python process, with memory overhead, rather than in the JVM. Is there an alternative that will be executed within the JVM only? – figs_and_nuts Jan 21 '22 at 10:00
0

Based on @pierre-gourseaud's answer, I think the syntactically correct version is this, since withColumn also needs the name of the new column:

df_new = df.withColumn("new_column_name", ~df["original"])  # Equivalent to "not original"

Tim C.