0

Using the following code in Spark(Java), we save dataframe in Oracle, it creates a table too if doesn't exists.

Dataset<Row> someAccountDF = sparkSession.createDataFrame(impalaAccountsDF.toJavaRDD(), AccountSchema.getSchema());
dataFrame.write().mode(saveMode).jdbc(connectionUrl, tableName, connectionParams);

Now it creates columns with double quotes like-

CREATE TABLE "SCHEMA"."ACCOUNT" 
   (    "primaryidentifier" VARCHAR2(255 BYTE), 
    "systemdata" VARCHAR2(255 BYTE), ......
)

So when I query
select primaryidentifier from account; //Doesn't work
but when I query
select "primaryidentifier" from account; //Works

but it creates issues in our code in mapping etc. and also doesn't look good having double quotes in columnname.

Out SchemaClass is like below. I would like to have something like Varchar2 instead of StringType:

public class AccountSchema {
    public StructType getSchema() {

        StructType schemaTyped = new StructType()
                .add("primaryidentifier", StringType)
                .add("systemdata", StringType)
.............}
}

I saw this question but unable to follow it (I'm know very little bit Scala)

abhihello123
  • 1,668
  • 1
  • 22
  • 38
  • Just copy/paster code from the answer you pointed out after your context initialization and before you write into database. Oracle is case sensitive, so the user unregistered the default JDBC oracle dialect and overwrote changing quoteIdentifier method to remove quotes. – gasparms Oct 15 '17 at 09:41
  • @gasparms, So it means I don't require any change in my AccountSchema. Right? Can you convert that code and give an answer in Java. I am new in Spark and don't know Scala so not able to do the conversion properly. – abhihello123 Oct 15 '17 at 09:48

0 Answers0