16

I am connecting to Hive via an ODBC driver from a .NET application. Is there a query to determine if a table already exists?

For example, in MSSQL you can query the INFORMATION_SCHEMA table and in Netezza you can query the _v_table table.

Any assistance would be appreciated.

Michael Robinson
  • 1,985
  • 2
  • 21
  • 31

7 Answers7

16

Execute the following command : show tables in DB like 'TABLENAME'
If the table exists, its name will be returned, otherwise nothing will be returned.
This is done directly from hive. for more options see this.

DB is the database in which you want to see if the table exists.
TABLENAME is the table name you seek,

What actually happens is that Hive queries its metastore (depends on your configuration but it can be in a standard RDBMS like MySQL) so you can optionally connect directly to the same metastore and write your own query to see if the table exists.

dimamah
  • 2,883
  • 18
  • 31
12

There are two approaches by which you can check that:

1.) As @dimamah suggested, just to add one point here, for this approach you need to

 1.1) start the **hiveserver** before running the query
 1.2) you have to run two queries
      1.2.1) USE <database_name>
      1.2.2) SHOW TABLES LIKE 'table_name'
      1.2.3) Then you check your result using Result set.

2.) Second approach is to use HiveMetastoreClient APIs, where you can directly use the APIs to check whether the table_name exist in a particular database or not.

For further help please go through this Hive 11

Mukesh S
  • 2,856
  • 17
  • 23
  • The link for "Hive 11" is now broken (404 Not Found). Do you know where is the updated location of the said documentation? I also looked at this doc but seems the info there is limited: https://cwiki.apache.org/confluence/display/Hive/Hive+APIs+Overview#HiveAPIsOverview-Metastore(Java) – leeyuiwah Feb 21 '18 at 21:45
12

When programming on Hive by Spark SQL, you can use following method to check whether Hive table exists.

if (hiveContext.hql("SHOW TABLES LIKE '" + tableName + "'").count() == 1) {
    println(tableName + " exists")
}
Fokko Driesprong
  • 2,075
  • 19
  • 31
Haimei
  • 12,577
  • 3
  • 50
  • 36
  • 1
    Spark is depreciating hql... will this solution be guaranteed to work using a regular hive context with SQL? Is there a more general Spark method for looking at Hive meta-information, other than running a query? – Glenn Strycker Jun 08 '15 at 19:28
  • 1
    Also -- using Spark 1.3.0, I get the following error: " User class threw exception: [1.13] failure: ``in'' expected but identifier like found show tables like 'sample_table_name'; " I also get this error when I do include "in " before the table name – Glenn Strycker Jul 09 '15 at 21:53
  • 1
    Spark 1.5. Works with sparkContext too, not just Hive Context. table_exists = len(sqlCtx.sql("SHOW TABLES IN "+ stg_db).filter("tableName='%s'" % stg_tab).collect()) == 1 – Tagar Mar 10 '16 at 05:32
  • In Spark 2+ we can use `sparkSession.sqlContext.tableNames(databaseName="default")` to get the list of tables in a given db_schema – CᴴᴀZ Aug 06 '21 at 13:12
9

If someone is using shell script like me then my answer could be useful. Assume that your table is in the default namespace.

table=your_hive_table
validateTable=$(hive --database default -e "SHOW TABLES LIKE '$table'")
if [[ -z $validateTable ]]; then
  echo "Error:: $table cannot be found"
  exit 1
fi
Alex Raj Kaliamoorthy
  • 2,035
  • 3
  • 29
  • 46
  • Due to HIVE-4364 bug this doesn't work with beeline unforunatally, +1 anyway. – Tagar Sep 20 '16 at 16:17
  • Beeline -> https://community.cloudera.com/t5/Support-Questions/How-to-check-the-existence-of-the-table-in-Beeline-Hive/m-p/181220/highlight/true#M143449 – Alex Sep 02 '20 at 19:59
  • Note: for exact table match -> grep -w instead of grep should be used in the link above. – Alex Sep 02 '20 at 20:10
6

If you're using SparkSQL you can do the following.

if "table_name" in sqlContext.tableNames("db_name"):
    ...do something

http://spark.apache.org/docs/2.1.0/api/python/pyspark.sql.html#pyspark.sql.SQLContext.tableNames

3

Code similar to below one can find in many of my Spark notebooks:

stg_table_exists = sqlCtx.sql("SHOW TABLES IN "+ stg_db) 
                  .filter("tableName='%s'" % stg_tab_name) .collect()

(made two-liner for readability)

I wish Spark would have an API call to check the same.

Tagar
  • 13,911
  • 6
  • 95
  • 110
0

If you're using a scala spark app and SparkSQL you can do the following

if spark.catalog.tableExists("tablename") {do something}
Tim
  • 189
  • 1
  • 14