How to use GlueMetaStore with spark.sql in JupyterHub

Question

I want to use the GlueMetaStore with spark.sql.

To do that, I set up an EMR cluster (release 5.16) with the following configurations:

{
    "Classification":"hive-site",
    "ConfigurationProperties":
    {
        "hive.metastore.client.factory.class":"com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    },
    "Configurations":[]
},
{
    "Classification":"spark-hive-site",
    "ConfigurationProperties":
    {
        "hive.metastore.client.factory.class":"com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    },
    "Configurations":[]
}

I used spark-core to query the databases in JupyterHub, but I only get the default database, which is empty. If the connection worked, there should be many more databases.

Do I need to call enableHiveSupport or something similar to get the connection working? If so, how can I set it in JupyterHub, given that the context is already loaded?

score 0 · Answer 1 · answered Sep 12 '18 at 07:37

I found the solution.

I had to edit "/etc/livy/conf.dist/livy.conf" on the master node and add

livy.repl.enableHiveContext = true

to it.
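
If you want to script that edit instead of doing it by hand, here is a minimal sketch in Python. The helper name enable_hive_context is made up for illustration, and it assumes the file uses simple "key = value" lines. It only touches the one setting named above and has to run with enough privileges to write the file (e.g. via sudo).

```python
from pathlib import Path

# Path and key are taken from the answer above; the helper itself is hypothetical.
LIVY_CONF = Path("/etc/livy/conf.dist/livy.conf")
KEY = "livy.repl.enableHiveContext"


def enable_hive_context(conf_path: Path = LIVY_CONF) -> bool:
    """Make sure KEY is set to true in conf_path. Return True if the file changed."""
    text = conf_path.read_text() if conf_path.exists() else ""
    lines = text.splitlines()
    wanted = f"{KEY} = true"
    found = False
    changed = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave commented-out entries alone
        # Assumption: settings are written as "key = value"
        name = stripped.split("=", 1)[0].strip()
        if name == KEY:
            found = True
            if stripped != wanted:
                lines[i] = wanted  # overwrite e.g. "... = false"
                changed = True
    if not found:
        lines.append(wanted)
        changed = True
    if changed:
        conf_path.write_text("\n".join(lines) + "\n")
    return changed
```

Calling enable_hive_context() with no arguments edits the path from this answer. Running it a second time leaves the file untouched and returns False.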

Then restart livy-server with:

sudo stop livy-server
sudo start livy-server

Restart the kernel and it works!
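
To check the result from a notebook, something like the following sketch should do. It assumes a PySpark kernel where a session object named spark is already defined, which I believe is the case for Livy-backed kernels but is worth verifying in your setup. The function name list_databases is just for illustration.

```python
def list_databases(spark):
    """Return the names of the databases visible to the given Spark session."""
    rows = spark.sql("SHOW DATABASES").collect()
    # The name of the result column differs between Spark versions,
    # so read the first field of each row by position instead of by name.
    return [row[0] for row in rows]
```

In a notebook cell, list_databases(spark) should now return the Glue databases rather than only default.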

answered Sep 12 '18 at 07:37

mad
