1

I have a Spark DataFrame with a decimal column, and I want to convert this column to a binary string. Is there a function for this? Can anybody help?

Thank you!

Beril Boga
  • I just added an answer to another thread for doing this (converting a value to a String of binary digits) which works for `Boolean`, `Byte`, `Short`, `Char`, `Int`, and `Long`. https://stackoverflow.com/a/54950845/501113 – chaotic3quilibrium Mar 01 '19 at 19:32

2 Answers

7

There is a built-in bin function, which the documentation describes as:

An expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".

So if you have a DataFrame such as

+-----+
|Value|
+-----+
|4    |
+-----+

root
 |-- Value: decimal(10,0) (nullable = true)

you can use the bin function as follows:

import org.apache.spark.sql.functions._
data.withColumn("Value_Binary", bin(col("Value")))

which should give you

+-----+------------+
|Value|Value_Binary|
+-----+------------+
|4    |100         |
+-----+------------+

root
 |-- Value: decimal(10,0) (nullable = true)
 |-- Value_Binary: string (nullable = true)
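
For intuition, here is a plain-Scala sketch of what bin appears to do with a decimal value, with no Spark involved. It assumes the decimal is first cast to a long (dropping any fractional part) and then rendered as a 64-bit two's complement string, which as far as I know matches the documented examples for inputs such as 13.3 and -13. The helper name binLike is hypothetical and is not part of Spark.

```scala
// Hypothetical helper mimicking Spark's bin on a decimal value.
// Assumption: the value is cast to Long first (fraction truncated),
// then formatted as an unsigned two's complement binary string.
def binLike(value: BigDecimal): String =
  java.lang.Long.toBinaryString(value.toLong)

val four      = binLike(BigDecimal(4))      // "100"
val truncated = binLike(BigDecimal("13.3")) // "1101", the .3 is dropped
val neg64     = binLike(BigDecimal(-13))    // 64 characters, leading 1s
```

The practical consequence is that fractional digits are lost and negative values come back as full 64-character strings, so it may be worth checking the column's scale and sign before relying on bin.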
Ramesh Maharjan
0

I solved this issue by creating a user-defined function.

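import org.apache.spark.sql.functions.{col, udf}

// Convert an Int to its binary string representation
val toBinStr: Int => String = _.toBinaryString

// Wrap the plain Scala function as a Spark UDF
val toBinStrUDF = udf(toBinStr)

// Apply the UDF to add a binary-string column to the source dataset
data.withColumn("Value_Binary", toBinStrUDF(col("Value"))).show()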
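
Because the UDF above takes an Int, values outside the Int range will not fit. Below is a minimal sketch of a wider, null-safe conversion in plain Scala that could be wrapped with udf in the same way. The name decimalToBin is hypothetical, and the sketch assumes the decimal values arrive as java.math.BigDecimal, which as far as I know is how Spark passes decimal columns to Scala code. Unlike bin, negative numbers come out with a leading minus sign rather than as two's complement.

```scala
// Hypothetical helper: converts the integer part of a decimal to binary digits.
// Returns null for null input so missing values stay missing.
def decimalToBin(value: java.math.BigDecimal): String =
  if (value == null) null
  else value.toBigInteger.toString(2)

val small  = decimalToBin(new java.math.BigDecimal("4"))          // "100"
val large  = decimalToBin(new java.math.BigDecimal("4294967296")) // 2^32, beyond Int
val signed = decimalToBin(new java.math.BigDecimal("-4"))         // "-100"
```

Going through BigInteger avoids any fixed width, so it also covers decimals too large for a Long, at the cost of the different negative-number format noted above.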
Beril Boga
  • Yours only works with `Int`. I just added an answer to another thread for doing this (converting a value to a String of binary digits) which works for `Boolean`, `Byte`, `Short`, `Char`, `Int`, and `Long`. https://stackoverflow.com/a/54950845/501113 – chaotic3quilibrium Mar 01 '19 at 19:33