1

I my Scala application I use a Spark plugin (spark-excel) for creating and writing Excel files with several new sheets via Apache POI.

import spark.implicits._

val df = Seq(
  ("2019-01-01 00:00:00", "7056589658"),
  ("2019-02-02 00:00:00", "7778965896")
).toDF("DATE_TIME", "PHONE_NUMBER")

df.show()

df.coalesce(1).write
  .format("com.crealytics.spark.excel")
  .option("dataAddress", "'Sheet'!!A1:Z1000000")
  .option("useHeader", "true")
  .option("dateFormat", "yy-mmm-d")
  .option("timestampFormat", "mm-dd-yyyy hh:mm:ss")
  .mode("append")
  .save("/path/filename.xlsx")

Question: Is there any way to change the style of the Excel file? For example I want to change the color of the header, also change the width of columns. What would you advise?

baitmbarek
  • 2,440
  • 4
  • 18
  • 26
Nurzhan Nogerbek
  • 4,806
  • 16
  • 87
  • 193
  • would pyspark be an option for you? If so you could collect the spark dataframe (assuming it is small data) and then use the xlwt library which allows you to set styles for each cell – Paul Sep 17 '19 at 20:05
  • @Paul unfortunately I can't use PySpark. As I said before in my case I have Scala application. Do you have any other ideas? – Nurzhan Nogerbek Sep 18 '19 at 03:38
  • Looking at the souce code of the writer function (https://github.com/crealytics/spark-excel/blob/master/src/main/scala/com/crealytics/spark/excel/ExcelFileSaver.scala) I don't any way to change styles in this library – Paul Sep 18 '19 at 07:35
  • Also in this issue https://github.com/crealytics/spark-excel/issues/124 they say there is no support for coloring cells – Paul Sep 18 '19 at 07:52
  • @Paul thank you for information! I found such [article](https://www.javatpoint.com/apache-poi-excel-cell-color) where author use java to solve the same task. I think I can use this approach in the code file Excel `File Saver.scala`. What do you think about it? – Nurzhan Nogerbek Sep 18 '19 at 08:38
  • Yes this should work, could be some work to do though. You'd also need to modify the DataLocator file in the source code – Paul Sep 18 '19 at 16:28
  • @Paul what exactly I need to change in `DataLocator.scala` file for your opinion? – Nurzhan Nogerbek Sep 19 '19 at 03:52
  • @Paul also I found that [autoSizeColumn](http://poi.apache.org/apidocs/dev/org/apache/poi/hssf/usermodel/HSSFSheet.html#autoSizeColumn-int-) method could expand the width of the column. I am little bit confused where exactly I need to set this method in your code library. I'd appreciate it if you could tell me. – Nurzhan Nogerbek Sep 19 '19 at 04:27
  • I also don't know the library and don't know where the formatting goes in. Looks like a work to me. I don't know how complex your application is, that why I asked if translating the whole App (or at least the part of writing to excel) to Python since that is very straight forward – Paul Sep 21 '19 at 09:49

0 Answers0