So far, I have been able to connect to and run queries over a MySQL database from Spark SQL, using the MySQL JDBC driver and registering the result as a Spark DataFrame.
Is it possible to connect to Teradata from Spark SQL and run queries over it?
Question: Is it possible to connect to Teradata from Spark SQL and run queries over it?
Yes, it's possible.
Create a DataFrame as in the example below and run Spark SQL on top of it.
Below is the way to do it with Spark JDBC:
val jdbcDF = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://<server_name>/TMODE=TERA,user=my_user,password=*****",
  "dbtable" -> "schema.table_name", // can also be a (select ...) subquery with an alias
  "driver" -> "com.teradata.jdbc.TeraDriver"))
Yes, it is possible!
Load the driver class specific to Teradata:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://<server_name>/TMODE=TERA,user=...,password=...",
  "dbtable" -> "schema.table_name", "driver" -> "com.teradata.jdbc.TeraDriver"))
Then register it as a temp table and query over it:
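Concretely, that last step is the same temp-table pattern as in the answer above (a sketch; td_table is a hypothetical name):

// make the DataFrame queryable from Spark SQL
df.registerTempTable("td_table")
sqlContext.sql("SELECT COUNT(*) FROM td_table").show()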
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val jddf = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://servername/TMODE=TERA,user=####,password=####,LOGMECH=LDAP",
  "dbtable" -> "(select count(column_name) as cnt from schemaname.table) AS ST", // the subquery is pushed down to Teradata
  "driver" -> "com.teradata.jdbc.TeraDriver"))
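On Spark 1.4+, the same load can also be written with the DataFrameReader API instead of the deprecated sqlContext.load; a sketch using the same placeholder URL and credentials as above:

// equivalent load via the Spark 1.4+ DataFrameReader API
val countDf = sqlContext.read.format("jdbc")
  .option("url", "jdbc:teradata://servername/TMODE=TERA,user=####,password=####,LOGMECH=LDAP")
  .option("dbtable", "(select count(column_name) as cnt from schemaname.table) AS ST")
  .option("driver", "com.teradata.jdbc.TeraDriver")
  .load()
countDf.show() // single-row result with the count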
Make sure you add the Teradata JDBC driver jar to your classpath and include it when you run the application:
sc.addJar("yourDriver.jar")
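When launching from the command line, the driver jars can also be passed with --jars (a sketch; terajdbc4.jar and tdgssconfig.jar are the customary Teradata JDBC jar names for older driver versions, so check what your installation ships):

spark-shell --jars /path/to/terajdbc4.jar,/path/to/tdgssconfig.jar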
val jdbcDF = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://<server_name>/TMODE=TERA,user=my_user,password=*****",
  "dbtable" -> "schema.table_name",
  "driver" -> "com.teradata.jdbc.TeraDriver"))
Also refer to the link below.