
I would appreciate your response to the query below.

I created a few tables in Vora (e.g. test, addresses). I was able to see them listed via SHOW DATASOURCETABLES and query them. Later I restarted the Vora instance, logged back in as the vora user, and started the Vora Spark shell. I am aware that I won't see these tables in the new shell, since they are not present in the new Spark context. However, I came across a link saying that `ClusterUtils.markAllHostsAsFailed()` will load all tables into the Vora Spark context from the metadata. Yet despite executing the series of commands below:

scala> import org.apache.spark.sql._
import org.apache.spark.sql._

scala> val SapSqlSc = new SapSQLContext(sc)

scala> import com.sap.spark.vora.client
import com.sap.spark.vora.client

scala> client.ClusterUtils.markAllHostsAsFailed()

scala> SapSqlSc.sql(s"""
     | SHOW DATASOURCETABLES
     | USING com.sap.spark.vora
     | OPTIONS
     | (
     | zkUrls "ip-x-x-x-1.ec2.internal:2181,ip-x-x-x-2.ec2.internal:2181",
     | namenodeurl "ip-x-x-x-1.ec2.internal:8020"
     | )
     | """.stripMargin).collect

I got the following error and exception:

16/03/04 11:56:24 ERROR Datastore.Schema: Failed initialising database.
Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@3d3efa54, see the next exception for details.
:
:
Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@3d3efa54, see the next exception for details.
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
asked by Manos Nikolaidis

1 Answer


The error is typically caused by multiple instances of SapSQLContext in the same spark-shell. I suspect that further down in your error messages you will see this error:

Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /<path_to_metastore_db>/metastore_db.

Background: In Vora 1.0 the SapSQLContext is based on HiveContext, which can only be instantiated once per spark-shell. From Vora 1.1 the SapSQLContext is based on SQLContext, and the issue should no longer occur when using multiple SapSQLContexts in one session.

There is, however, another issue: the command `client.ClusterUtils.markAllHostsAsFailed()` is wrongly included in the documentation, as the feature is not yet enabled. We will remove it from the documentation and add it back once the feature is enabled. Until then, if your Vora engines (v2servers) have been restarted, you need to recreate the tables using CREATE TABLE statements.
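To illustrate, recreating a table in the Vora Spark shell might look like the sketch below. This is only an illustration, not confirmed syntax from the answer: the table name, columns, file path, and host names are placeholders, and the exact OPTIONS keys (e.g. `tableName`, `paths`) can differ between Vora versions, so check the documentation for your release.

```scala
// Sketch only: requires a running Vora cluster and the Vora Spark extensions
// on the classpath. Table name, columns, path, and hosts are placeholders.
import org.apache.spark.sql._

val sapSqlContext = new SapSQLContext(sc)

// Recreate a previously existing table after a Vora engine restart.
sapSqlContext.sql(s"""
  CREATE TABLE test (id int, name string)
  USING com.sap.spark.vora
  OPTIONS (
    tableName "test",
    paths "/user/vora/test.csv",
    zkUrls "ip-x-x-x-1.ec2.internal:2181,ip-x-x-x-2.ec2.internal:2181",
    namenodeurl "ip-x-x-x-1.ec2.internal:8020"
  )
""".stripMargin)
```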

answered by Frank Legler
  • You are right. My PuTTY session was closed, and when I re-logged in, launched a new Vora shell, and created the same SAP SQL context variable, I got the above issue. Does that mean that, despite the metadata being stored in the ZooKeeper catalog, I cannot load the tables into the Spark shell? Do I need to clean the ZooKeeper catalog using a command and then recreate the same tables again? If that is the case, I would need to recreate these tables every time I bring the Vora instance down and back up. Please correct my understanding. – Shari M Mar 08 '16 at 12:18
  • Up to Vora 1.1 patch 1: If you have not restarted the Vora engines, you can use the 'REGISTER TABLES' command to register the existing tables in the current Spark session. If you have restarted the Vora engines, you need to re-create the tables. Also see http://stackoverflow.com/questions/34784246/vora-tables-in-zeppelin-and-spark-shell. With Vora 1.2 we plan to provide a way to re-load tables after a Vora engine restart, similar to the markAllHostsAsFailed above. – Frank Legler Mar 09 '16 at 00:13
  • Thanks Frank for the valuable info. I need one more piece of info regarding Vora functionality. To my knowledge, Vora 1.0 currently supports parent-child hierarchies and the related built-in UDFs. Does any version of Vora support level hierarchies as well? – Shari M Mar 17 '16 at 15:29
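For reference, the 'REGISTER TABLES' approach mentioned in the comments might look like the sketch below in the Vora Spark shell. This is an assumption-laden illustration, not confirmed syntax from this thread: the exact statement form and option keys vary by Vora version, the host names are placeholders from the question, and it only applies while the Vora engines have not been restarted.

```scala
// Sketch only: requires a running Vora cluster and the Vora Spark extensions.
// zkUrls/namenodeurl values are placeholders taken from the question above.
import org.apache.spark.sql._

val sapSqlContext = new SapSQLContext(sc)

// Re-register the tables known to the Vora catalog in this Spark session.
// Up to Vora 1.1 patch 1 this works only if the engines were NOT restarted.
sapSqlContext.sql(s"""
  REGISTER ALL TABLES
  USING com.sap.spark.vora
  OPTIONS (
    zkUrls "ip-x-x-x-1.ec2.internal:2181,ip-x-x-x-2.ec2.internal:2181",
    namenodeurl "ip-x-x-x-1.ec2.internal:8020"
  )
""".stripMargin).collect
```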