
When writing a Spark DataFrame to CSV, I want to remove the double quotes (") from numeric data only.

Actual Output:

+-------+---------+-----+
|user_id|course   |marks|
+-------+---------+-----+
|    "1"|    "eng"|  "9"|
|    "1"| "french"|  "7"|
+-------+---------+-----+

Expected Output:

+-------+---------+-----+
|user_id|course   |marks|
+-------+---------+-----+
|      1|    "eng"|    9|
|      1| "french"|    7|
+-------+---------+-----+
Harsh
  • Could you post the CSV you used? When I try this on your data, I don't get quotes around the numbers. – Sparker0i Feb 17 '20 at 11:20

1 Answer


In the DataFrame, cast the numeric columns to IntegerType:

import org.apache.spark.sql.types.IntegerType

// Cast the numeric columns to integers; leave the string column unchanged
df
  .select(df("user_id").cast(IntegerType), df("course"), df("marks").cast(IntegerType))
  .show()
Sivakumar
  • Thanks for replying. Please note that some of my columns are decimal and some are int, so I cannot cast all of them to int. – Harsh Feb 17 '20 at 11:30
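
One way to handle a mix of integer and decimal columns might be to cast each column to its own type instead of casting everything to IntegerType. The sketch below is not from the original answer. It assumes the quotes shown by show() are literal characters stored in string columns (show() does not add quotes itself), so it strips them before casting. The object name CastNumericColumns, the helper castNumeric, and the column-to-type map (including DecimalType(10, 2) for marks) are hypothetical and would need to match the real schema.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, regexp_replace}
import org.apache.spark.sql.types.{DataType, DecimalType, IntegerType}

object CastNumericColumns {
  // Hypothetical mapping from column name to target type; adjust to the real schema.
  val numericTypes: Map[String, DataType] = Map(
    "user_id" -> IntegerType,
    "marks"   -> DecimalType(10, 2)
  )

  // For every column listed in `types`, remove literal double quotes and cast
  // the result to the requested type. Columns not listed are left untouched.
  def castNumeric(df: DataFrame, types: Map[String, DataType]): DataFrame =
    types.foldLeft(df) { case (acc, (name, dataType)) =>
      acc.withColumn(name, regexp_replace(col(name), "\"", "").cast(dataType))
    }
}
```

Calling CastNumericColumns.castNumeric(df, CastNumericColumns.numericTypes) returns a DataFrame in which user_id is an integer and marks is a decimal, while course keeps its original string values. The result can be inspected with show() or passed to the CSV writer as before.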