You can use the format_number function as follows:
import org.apache.spark.sql.functions.format_number
// The $"..." column syntax needs import spark.implicits._ (already in scope in spark-shell)
df.withColumn("NumberColumn", format_number($"NumberColumn", 5))
Here 5 is the number of decimal places you want to show.
As the linked documentation states, the format_number function returns a string column:
format_number(Column x, int d)
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places, and returns the result as a string column.
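To make the string return type concrete, here is a minimal sketch that can be run locally. The sample values, the local SparkSession settings, and the object name FormatNumberSketch are assumptions made for illustration only; they are not part of the original question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.format_number

object FormatNumberSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed setup)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("format-number-sketch")
      .getOrCreate()
    import spark.implicits._ // enables $"..." and toDF

    // Hypothetical sample data
    val df = Seq(1234567.891234567, 0.5).toDF("NumberColumn")

    // Overwrite the column with its formatted version, 5 decimal places
    val formatted = df.withColumn("NumberColumn", format_number($"NumberColumn", 5))

    // The schema now reports NumberColumn as a string, not a double
    formatted.printSchema()
    formatted.show(false)

    spark.stop()
  }
}
```

Because the result is a string column, any later numeric operation on NumberColumn would need an explicit cast back to a numeric type.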
If you don't need the comma (,) thousands separators, you can call the regexp_replace function, which is defined as
regexp_replace(Column e, String pattern, String replacement)
Replace all substrings of the specified string value that match regexp with rep.
and use it as follows:
import org.apache.spark.sql.functions.regexp_replace
df.withColumn("NumberColumn", regexp_replace(format_number($"NumberColumn", 5), ",", ""))
This removes the commas (,) that format_number inserts into large numbers.
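The two steps can be put side by side in a small self-contained sketch. The sample values, the extra column names Formatted and NoCommas, and the object name RemoveCommasSketch are hypothetical and chosen only so both results are visible at once.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{format_number, regexp_replace}

object RemoveCommasSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed setup)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("remove-commas-sketch")
      .getOrCreate()
    import spark.implicits._ // enables $"..." and toDF

    // Hypothetical sample data: one value large enough to get separators
    val df = Seq(1234567.891234567, 42.0).toDF("NumberColumn")

    // Formatted string with thousands separators, 5 decimal places
    val withCommas = format_number($"NumberColumn", 5)

    val result = df
      .withColumn("Formatted", withCommas)
      // Strip every comma from the formatted string
      .withColumn("NoCommas", regexp_replace(withCommas, ",", ""))

    result.show(false)

    spark.stop()
  }
}
```

Both new columns are still strings; regexp_replace only edits the text and does not change the column type.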