
I have an Avro file containing a decimal logicalType, declared as follows:

"type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":2}]


When I read the file with the Spark Scala library, the DataFrame schema is:

MyField: binary (nullable = true)


How can I convert this column into a decimal type?

Mauro Midolo

1 Answer


You can specify a schema in the read operation:

import org.apache.spark.sql.types._

val schema = new StructType()
    .add(StructField("MyField", DecimalType(19, 2)))

or you can cast the column with a UDF:

val binToInt: String => Integer = Integer.parseInt(_, 2)
val binToIntegerUdf = udf(binToInt)

df.withColumn("MyField", binToIntegerUdf(col("MyField").cast("string")))
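As the comments below point out, casting the binary column to `string` does not recover the decimal value. A more direct route is to decode the bytes the way the Avro specification defines the `decimal` logical type: the two's-complement big-endian representation of the unscaled integer. A minimal sketch of that decoding, independent of Spark (`decodeDecimal` and `decimalUdf` are hypothetical names, and the scale of 2 is taken from the schema in the question):

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Avro encodes a bytes-decimal as the two's-complement big-endian
// representation of the unscaled integer; the scale comes from the schema.
def decodeDecimal(bytes: Array[Byte], scale: Int): BigDecimal =
  BigDecimal(new JBigDecimal(new BigInteger(bytes), scale))

// 0x3039 is the two's-complement encoding of 12345; with scale 2 → 123.45
val example = decodeDecimal(Array(0x30, 0x39).map(_.toByte), 2)

// In Spark this can be wrapped in a UDF and applied to the binary column:
// import org.apache.spark.sql.functions.{udf, col}
// val decimalUdf = udf((b: Array[Byte]) => decodeDecimal(b, 2))
// df.withColumn("MyField", decimalUdf(col("MyField")))
```

Note also that the spark-avro module bundled with Spark 2.4+ maps the Avro decimal logical type to `DecimalType` automatically, so upgrading may remove the need for a UDF entirely.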
hamza tuna
  • The cast solution raises the following error: cannot resolve 'CAST(`MyField` AS DECIMAL(10,0))' due to data type mismatch: cannot cast binary to decimal(10,0); – Mauro Midolo Nov 16 '18 at 14:03
  • Updated. You can write your own function to do it and save it as udf. – hamza tuna Nov 16 '18 at 14:47
  • 1
    This solution does not work. A `binary` cannot be `cast`ed into `decimal`. Casting it into `string` converts the underlying `Array[Byte]` to `String`. It does not return the string representation of the decimal – Nicus Oct 16 '19 at 14:11