Use DataFrameNaFunctions, which you reach on a DataFrame through df.na.
DataFrame fill(double value)
Returns a new DataFrame that replaces null values in numeric columns with value.

DataFrame fill(double value, scala.collection.Seq<String> cols)
(Scala-specific) Returns a new DataFrame that replaces null values in specified numeric columns.
Example usage:
df.na.fill(0.0, Seq("your columnname"))
Null values in that column are replaced with 0.0, or with whatever default value you pass instead.
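To make the fill call above concrete, here is a minimal sketch. It assumes Spark 2.x or later (so a SparkSession is available; on 1.x you would build the DataFrame from a SQLContext instead), and the object name, column names, and sample rows are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object FillNullsExample {
  // Returns the "height" values after filling nulls, ordered by "name".
  def filledHeights(): List[Double] = {
    // Assumption: a local, single-threaded session is enough for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("fill-nulls-sketch")
      .getOrCreate()
    import spark.implicits._

    try {
      // Hypothetical data: Option[Double] produces a nullable numeric column.
      val df = Seq[(String, Option[Double])](
        ("alice", Some(1.5)),
        ("bob", None)
      ).toDF("name", "height")

      // Only the listed column is touched; nulls in "height" become 0.0.
      val filled = df.na.fill(0.0, Seq("height"))

      filled.orderBy("name").select("height").as[Double].collect().toList
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    filledHeights().foreach(println)
}
```

Passing the column list keeps the default from leaking into other numeric columns where a null might mean something different.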
replace is also useful for replacing empty strings with default values:
public <T> DataFrame replace(String col, java.util.Map<T,T> replacement)
Replaces values matching keys in replacement map with the corresponding values. Key and value of replacement map must have the same type, and can only be doubles or strings. If col is "*", then the replacement is applied on all string columns or numeric columns.

import com.google.common.collect.ImmutableMap;

// Replaces all occurrences of 1.0 with 2.0 in column "height".
df.na().replace("height", ImmutableMap.of(1.0, 2.0));

// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
df.na().replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));

// Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
df.na().replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));

Parameters:
col - name of the column to apply the value replacement
replacement - value replacement map, as explained above
Returns: (undocumented)
Since: 1.3.1
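For example:
df.na.replace("your column", Map("" -> "0.0"))
Because keys and values in the map must have the same type, the default for a string column has to be a string as well.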
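As a rough end-to-end sketch of replacing empty strings in a string column, assuming Spark 2.x or later and using a hypothetical object name, column names, and sample rows:

```scala
import org.apache.spark.sql.SparkSession

object ReplaceEmptyStringsExample {
  // Returns the "colour" values after replacement, ordered by "name".
  def replacedColours(): List[String] = {
    // Assumption: a local, single-threaded session is enough for this sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("replace-empty-sketch")
      .getOrCreate()
    import spark.implicits._

    try {
      // Hypothetical data: one row carries an empty string instead of a value.
      val df = Seq(("alice", ""), ("bob", "blue")).toDF("name", "colour")

      // Key and value are both strings, as replace requires.
      val replaced = df.na.replace("colour", Map("" -> "unknown"))

      replaced.orderBy("name").select("colour").as[String].collect().toList
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    replacedColours().foreach(println)
}
```

Note that replace matches values, not nulls: a null in the column is left as it is, so combine it with fill if the column can contain both.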