
So far, I have been able to connect to a MySQL database and run queries over it from Spark SQL, using the MySQL JDBC driver and registering the result as a Spark DataFrame.

Is it possible to connect to Teradata from Spark SQL and run queries over it?

Prem Singh Bist

4 Answers


Question: Is it possible to connect to Teradata from Spark SQL and run queries over it?

Yes, it's possible.

Create a DataFrame as in the example below and run Spark SQL on top of it.

Below is the way to do it with the Spark JDBC data source:

val jdbcDF = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://<server_name>, TMODE=TERA, user=my_user, password=*****",
  "dbtable" -> "schema.table_name", // can also be a subquery wrapped in parentheses with an alias
  "driver" -> "com.teradata.jdbc.TeraDriver"))
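
To run Spark SQL on top of the DataFrame, register it as a temporary table and query it. A minimal sketch, assuming the `jdbcDF` from above (the table name `teradata_table` is just an example):

```scala
// Register the Teradata-backed DataFrame under a temporary table name
// ("teradata_table" is an arbitrary example name), then query it with Spark SQL.
jdbcDF.registerTempTable("teradata_table")
val result = sqlContext.sql("SELECT * FROM teradata_table LIMIT 10")
result.show()
```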
user3190018
Ram Ghadiyaram

Yes, it is possible!

Load the driver class specific to Teradata:

val sqlcontext = new org.apache.spark.sql.SQLContext(sc)

val Df_name = sqlcontext.load("jdbc", Map(
  "url" -> "<uri to teradata>",
  "dbtable" -> "<table name>",
  "driver" -> "com.teradata.jdbc.TeraDriver"))

Register it as a temp table and query over it.
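
That last step looks like this, assuming the `Df_name` DataFrame from above (`my_td_table` is an arbitrary example name):

```scala
// Register the DataFrame as a temp table, then run a Spark SQL query over it.
Df_name.registerTempTable("my_td_table")
sqlcontext.sql("SELECT count(*) FROM my_td_table").show()
```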

user1234
val sqlcontext = new org.apache.spark.sql.SQLContext(sc)
val jddf = sqlcontext.load("jdbc", Map(
  "url" -> "jdbc:teradata://servername/, TMODE=TERA, user=####, password=####, LOGMECH=LDAP",
  "dbtable" -> "(select count(column_name) as cnt from schemaname.tablename) AS ST",
  "driver" -> "com.teradata.jdbc.TeraDriver"))
pczeus
sftengg

Make sure you add the Teradata JDBC driver jar to your classpath and include it when you run the application.

sc.addJar("yourDriver.jar")
 
val jdbcDF = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://<server_name>, TMODE=TERA, user=my_user, password=*****",
  "dbtable" -> "schema.table_name",
  "driver" -> "com.teradata.jdbc.TeraDriver"))

Also refer to the link below:

https://community.hortonworks.com/questions/63826/hi-is-there-any-connector-for-teradata-to-sparkwe.html

Rahul