I have a set of Excel sheets that I am trying to read with Spark using the com.crealytics.spark.excel package. One column in the sheet, Survey ID, contains integer IDs. When I read the data through Spark, the values are converted to doubles.
How can I retain the integer format of these values when reading from the Excel sheet?
This is what I tried:
val df = spark.read.format("com.crealytics.spark.excel")
  .option("location", <somelocation>)
  .option("useHeader", "true")
  .option("treatEmptyValuesAsNulls", "true")
  .option("inferSchema", "true")
  .option("addColorColumns", "false")
  .load()
The actual values in the sheet are integers (for example, 17632889), but this is what Spark reads:
+-----------+
|  Survey ID|
+-----------+
|1.7632889E7|
|1.7632889E7|
|1.7632934E7|
|1.7633233E7|
|1.7633534E7|
|1.7655812E7|
|1.7656079E7|
|1.7930478E7|
|1.7944498E7|
|1.8071246E7|
+-----------+
If I cast the column to integer, I get the data in the required format. But is there a better way to do this?
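import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

// Cast the inferred double column back to an integer column
val finalDf = df.withColumn("Survey ID", col("Survey ID").cast(IntegerType))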
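For anyone who wants to reproduce the cast workaround without the Excel file, here is a minimal sketch. It assumes a local SparkSession and uses a small in-memory DataFrame as a stand-in for the data read from Excel. The object name SurveyIdCast, the helper castToInt, and the two sample rows are made up for illustration and are not part of spark-excel.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

object SurveyIdCast {
  // Hypothetical helper: replaces the named column with an integer-typed copy.
  def castToInt(df: DataFrame, column: String): DataFrame =
    df.withColumn(column, col(column).cast(IntegerType))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("survey-id-cast")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the Excel data: with inferSchema, the IDs arrive as doubles.
    val df = Seq(1.7632889E7, 1.7632934E7).toDF("Survey ID")

    val finalDf = castToInt(df, "Survey ID")
    finalDf.printSchema() // "Survey ID" is now reported as integer
    finalDf.show()

    spark.stop()
  }
}
```

Note that IntegerType only holds values up to 2,147,483,647, so LongType would be the safer target if the IDs can grow beyond that.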