I am trying to replace a full stop in my raw data with the value 0 in PySpark.
- I tried to use a .when and .otherwise statement.
- I tried to use regexp_replace to change the '.' to 0.
Code tried:
from pyspark.sql import functions as F
#For #1 above:
dataframe2 = dataframe1.withColumn("test_col", F.when(((F.col("test_col") == F.lit(".")), 0).otherwise(F.col("test_col")))
#For #2 above:
dataframe2 = dataframe1.withColumn('test_col', F.regexp_replace(dataframe1.test_col, '.', 0))
Instead of "." it should rewrite the column with numbers only (i.e. there is a number in non full stop rows, otherwise, it's a full stop that should be replaced with 0).