I am trying to read an RDBMS table from a Greenplum database using Spark. I have the following columns:
val allColumnsSeq: Seq[String] = Seq("usd_exchange_rate", "usd_exchange_rate::character varying as usd_exchange_rate_text")
I am trying to read the above columns in Spark as follows:
val yearDF = spark.read.format("io.pivotal.greenplum.spark.GreenplumRelationProvider")
  .option("url", connectionUrl)
  .option("dbtable", "x_lines")
  .option("dbschema", "copydb")
  .option("user", devUserName)
  .option("password", devPassword)
  .option("partitionColumn", "id")
  .load()
  .where("year=2017 and month=12")
  .select(allColumnsSeq map col: _*)
  .withColumn(flagCol, lit(0))
Certain columns in Greenplum are of the decimal datatype with a high precision. In the above table, that column is usd_exchange_rate, which holds nearly 45 digits of precision. In our project we keep the original column (usd_exchange_rate) and also create a new column from it as character data, with _text appended to the column name. In this case:
decimal datatype: usd_exchange_rate
same column in character datatype: usd_exchange_rate_text
When I execute the above code, I get the exception:
org.apache.spark.sql.AnalysisException: cannot resolve '`usd_exchange_rate::character varying as usd_exchange_rate_text`'
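From what I understand, col treats each element of allColumnsSeq as a plain column name, so the select above effectively expands to something like this (my reading of it, using the same names as in my code), which is why the second entry cannot be resolved:

// my understanding: the whole cast expression is passed to col() as a single column name
.select(
  col("usd_exchange_rate"),
  col("usd_exchange_rate::character varying as usd_exchange_rate_text")
)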
I see that I am expressing the cast in the wrong format, but I don't understand how I can read the same column in both decimal and text format in one step. Could anyone let me know if there is a way to achieve this in Spark?
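One idea I was considering is to drop the cast expression from allColumnsSeq and do the cast inside Spark instead, roughly like the sketch below (just my rough idea, reusing the same connection options as above; I am not sure whether this is the idiomatic way or whether it keeps all ~45 digits of precision when converted to text):

import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.StringType

// read only the real column names from Greenplum
val plainColumns: Seq[String] = Seq("usd_exchange_rate")

val yearDF = spark.read.format("io.pivotal.greenplum.spark.GreenplumRelationProvider")
  .option("url", connectionUrl)
  .option("dbtable", "x_lines")
  .option("dbschema", "copydb")
  .option("user", devUserName)
  .option("password", devPassword)
  .option("partitionColumn", "id")
  .load()
  .where("year=2017 and month=12")
  .select(plainColumns map col: _*)
  // derive the text copy in Spark instead of with the Greenplum :: cast syntax
  .withColumn("usd_exchange_rate_text", col("usd_exchange_rate").cast(StringType))
  .withColumn(flagCol, lit(0))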