Questions tagged [hive-metastore]

Hive Metastore (HMS) is the central repository of Apache Hive (the open source data warehouse system built on top of Hadoop). It stores metadata for Hive databases, tables, partitions, user groups, role grants, and statistics in a relational database. Use this tag for questions related to the Apache Hive central schema repository.

243 questions
35
votes
18 answers

java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

I have configured Hive as described at this link: http://www.youtube.com/watch?v=Dqo1ahdBK_A, but I am getting the following error while creating a table in Hive. I am using hadoop-1.2.1 and hive-0.12.0. hive> create table employee(emp_id int,name…
Raju Sharma
  • 2,496
  • 3
  • 23
  • 41
17
votes
4 answers

Setup Standalone Hive Metastore Service For Presto and AWS S3

I'm working in an environment where I have an S3 service being used as a data lake, but not AWS Athena. I'm trying to set up Presto to be able to query the data in S3, and I know I need to define the data structure as Hive tables through the Hive…
mhaken
  • 1,075
  • 4
  • 14
  • 28
16
votes
2 answers

Hive service, HiveServer2 & MetaStore service?

I am trying to understand Hive in terms of architecture, and I am referring to Tom White's book on Hadoop. I came across the following terms with regard to Hive: Hive Services, HiveServer2, and metastore, among others. Referring to the diagrams below from…
CuriousMind
  • 8,301
  • 22
  • 65
  • 134
14
votes
0 answers

Issue with AWS Glue Data Catalog as Metastore for Spark SQL on EMR

I have an AWS EMR cluster (v5.11.1) with Spark (v2.2.1) and am trying to use AWS Glue Data Catalog as its metastore. I have followed the steps in the guidelines provided in the official AWS documentation (reference link below), but I am facing some…
9
votes
1 answer

Spark and Hive in Hadoop 3: Difference between metastore.catalog.default and spark.sql.catalogImplementation

I'm working on a Hadoop cluster (HDP) with Hadoop 3. Spark and Hive are also installed. Since the Spark and Hive catalogs are separated, it's sometimes a bit confusing to know how and where to save data in a Spark application. I know that the property…
D. Müller
  • 3,336
  • 4
  • 36
  • 84
9
votes
4 answers

How can I convince spark not to make an exchange when the join key is a super-set of the bucketBy key?

While testing for a production use case, I created and saved (using Hive Metastore) the following tables:
table1: fields: key1, key2, value1; sortedBy: key1, key2; bucketBy: key1, 100 buckets
table2: fields: key1, key2, value2; sortedBy: key1, key2; bucketBy:…
zetaprime
  • 278
  • 2
  • 14
8
votes
1 answer

How to check if a partition exists in Hive?

I have a Hive table that is partitioned by column dt. I need to add a partition if it does not exist, for example, dt='20181219'. Currently I'm using HiveMetaStoreClient#getPartition(dbName, tableName, 20181219). If the partition does not exist, then…
xingbin
  • 27,410
  • 9
  • 53
  • 103
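The check-then-add pattern described in the question above can be sketched in a few lines. This is only an illustration in Python, not the Hive API: `InMemoryMetastore`, `NoSuchPartition`, and `ensure_partition` are hypothetical names invented here, and the sketch assumes that a lookup for a missing partition raises an error rather than returning an empty result.

```python
class NoSuchPartition(Exception):
    """Hypothetical stand-in for a 'partition not found' error."""


class InMemoryMetastore:
    """Hypothetical in-memory stand-in for a metastore client (not the Hive API)."""

    def __init__(self):
        # Maps (db, table) -> set of partition value tuples.
        self._partitions = {}

    def get_partition(self, db, table, values):
        key = tuple(values)
        if key not in self._partitions.get((db, table), set()):
            raise NoSuchPartition(f"{db}.{table} has no partition {key}")
        return key

    def add_partition(self, db, table, values):
        self._partitions.setdefault((db, table), set()).add(tuple(values))


def ensure_partition(client, db, table, values):
    """Add the partition only if the lookup reports it missing.

    Returns True if the partition was created, False if it already existed.
    """
    try:
        client.get_partition(db, table, values)
        return False
    except NoSuchPartition:
        client.add_partition(db, table, values)
        return True


client = InMemoryMetastore()
first = ensure_partition(client, "default", "events", ["20181219"])
second = ensure_partition(client, "default", "events", ["20181219"])
print(first, second)
```

In Hive SQL itself, `ALTER TABLE ... ADD IF NOT EXISTS PARTITION (dt='20181219')` expresses the same idea in a single statement, which avoids the gap between the check and the add.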
8
votes
2 answers

AWS Glue Data Catalog as Metastore for external services like Databricks

Let's say the data lake is on AWS, using S3 as storage and Glue as the data catalog. We can then easily use Athena, Redshift, or EMR to query data on S3 using Glue as the metastore. My question is: is it possible to expose the Glue Data Catalog as a metastore for…
Obaid
  • 237
  • 2
  • 14
8
votes
2 answers

How to get column name and type in hive

I know of these: to get the column names in a table, we can run: show columns in <table_name>. To get a description of a table (including column_name, column_type, and many other details): describe [formatted] <table_name>. I know…
Ani Menon
  • 27,209
  • 16
  • 105
  • 126
6
votes
3 answers

How to fix error on pyspark EMR Notebook - AnalysisException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

I am trying to run SQL queries using the spark.sql() or sqlContext.sql() method (here spark is the variable for SparkSession object available to us when we start EMR Notebook) on a public dataset using EMR notebook attached to an EMR cluster which…
6
votes
2 answers

Apache Spark 2.3.1 with Hive metastore 3.1.0

We have upgraded our HDP cluster to 3.1.1.3.0.1.0-187 and have discovered that Hive has a new metastore location and that Spark can't see Hive databases. In fact we see: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database ... not found. Could…
Eugene Lopatkin
  • 2,351
  • 1
  • 22
  • 34
6
votes
2 answers

Where does Hive data get stored?

I am a little confused about where Hive stores its data. Does it store its data in HDFS or in an RDBMS? Does the Hive Metastore use an RDBMS to store the Hive tables' metadata? Thanks in advance!
Naman Agarwal
  • 614
  • 1
  • 8
  • 28
5
votes
0 answers

Spark SQL external table (hive support) - Find the location 'path' of the external (blob storage) table, within the metastore db

I have set up a standalone hive-metastore (v3.0.0) backed by Postgres and created external tables within Spark SQL. The external data location is in Azure Blob storage. I am able to query these tables by using dbname.tablename instead of the actual location.…
bukli
  • 172
  • 2
  • 9
5
votes
1 answer

Can I use Cloud Dataproc with an external Hive Metastore?

By default, Cloud Dataproc runs a Hive Metastore local to the Dataproc cluster. This means that the metastore is ephemeral with the cluster, and that it can be a pain to have multiple clusters using a single metastore. Is it possible to point Dataproc clusters…
James
  • 2,321
  • 14
  • 30
5
votes
2 answers

How to install Hive Metastore in Kubernetes?

I am working on a project on Kubernetes where I use Spark SQL to create tables, and I would like to add partitions and schemas to a Hive Metastore. However, I did not find any proper documentation on installing Hive Metastore on Kubernetes. Is it…
Yassir S
  • 1,032
  • 3
  • 21
  • 44