
In Hive, how do we search a table by name in all databases?

I am a Teradata user. Is there any counterpart in Hive to Teradata's system tables, such as dbc.tables and dbc.columns?

– v83rahul

7 Answers


You can use a wildcard pattern with SHOW TABLES. For example, to search for a table whose name starts with "benchmark" when you don't know the rest of the name:

Input in HIVE CLI:

show tables like 'ben*'

Output:

+-----------------------+--+
|       tab_name        |
+-----------------------+--+
| benchmark_core_month  |
| benchmark_core_qtr    |
| benchmark_core_year   |
+-----------------------+--+
3 rows selected (0.224 seconds)
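Note that the pattern uses Hive's glob-style matching (* for any sequence of characters), not the SQL LIKE '%' syntax. The matching behaviour can be illustrated in plain Python, with fnmatch as a rough stand-in for Hive's matcher (the table names are the ones from the output above; "orders" is a made-up non-matching name):

```python
from fnmatch import fnmatch

tables = ["benchmark_core_month", "benchmark_core_qtr",
          "benchmark_core_year", "orders"]

# 'ben*' matches any name starting with "ben", analogous to
# SHOW TABLES LIKE 'ben*' (glob syntax, not SQL LIKE '%ben%').
matches = [t for t in tables if fnmatch(t, "ben*")]
print(matches)
```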

Alternatively, if you are using Beeline, try the following command:

!tables

Note: this works only in Beeline (the JDBC-based client).

More about beeline: http://blog.cloudera.com/blog/2014/02/migrating-from-hive-cli-to-beeline-a-primer/

– Mantej Singh

You can also use HDFS to find a table across all databases.

The default path of Hive databases is:

/apps/hive/warehouse/

So, using hdfs (the pattern is quoted so the shell does not expand it):

hdfs dfs -find /apps/hive/warehouse/ -name 't*'
– HISI
  • How do I check the path of the Hive database on my system? I don't see the path you mentioned above – Dileep Dominic Nov 01 '18 at 11:12
  • This path is not in the OS file system but in the HDFS file system. Run hdfs dfs -ls /apps/hive/warehouse/, and also check that you have permission to read this path – HISI Nov 02 '18 at 17:57

You should query the metastore.

You can find the connection properties within hive-site.xml

In bash:

grep -A1 jdo $HIVE_HOME/conf/hive-site.xml

    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1/metastore?createDatabaseIfNotExist=true</value>
--
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
--
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
--
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>cloudera</value>

Within the metastore you can use a query similar to the following

In MySQL:

select    *
from      metastore.DBS  as d
join      metastore.TBLS as t
          on t.DB_ID = d.DB_ID
where     t.TBL_NAME like '% ... put something here ... %'
order by  d.NAME, t.TBL_NAME;
– David דודו Markovitz

Hive stores all its metadata in the Metastore. The Metastore schema can be found here: https://issues.apache.org/jira/secure/attachment/12471108/HiveMetaStore.pdf

It has tables such as DBS for databases, TBLS for tables, and a columns table. You can join them as appropriate to find table names or column names.
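As an illustration of the DBS/TBLS join, here is a minimal sketch using an in-memory SQLite database as a stand-in for the real metastore backend (the table and column names follow the metastore schema, but the rows here are made up):

```python
import sqlite3

# Stand-in for the Hive metastore: only the columns the join needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DBS  (DB_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER, TBL_NAME TEXT);
    INSERT INTO DBS  VALUES (1, 'default'), (2, 'sales');
    INSERT INTO TBLS VALUES (1, 1, 'benchmark_core_month'),
                            (2, 2, 'benchmark_core_qtr'),
                            (3, 2, 'orders');
""")

# Same shape as the metastore query: find tables by name pattern
# across every database.
rows = conn.execute("""
    SELECT d.NAME, t.TBL_NAME
    FROM DBS AS d
    JOIN TBLS AS t ON t.DB_ID = d.DB_ID
    WHERE t.TBL_NAME LIKE '%benchmark%'
    ORDER BY d.NAME, t.TBL_NAME
""").fetchall()

for db, tbl in rows:
    print(db, tbl)
```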

– Avi C

Searching for tables whose name contains infob, across all Hive databases:

for i in `hive -e "show schemas"`; do echo "Hive DB: $i"; hive -e "use $i; show tables"|grep "infob"; done

That's an extension of Mantej Singh's answer: you can use pyspark to find tables across all Hive databases (not just one):

from functools import reduce
from pyspark import SparkContext
from pyspark.sql import DataFrame, HiveContext

sc = SparkContext()
sqlContext = HiveContext(sc)

dbnames = [row.databaseName for row in sqlContext.sql('SHOW DATABASES').collect()]

tnames = []
for dbname in dbnames:
    tnames.append(sqlContext.sql('SHOW TABLES IN {} LIKE "%your_pattern%"'.format(dbname)))

tables = reduce(DataFrame.union, tnames)
tables.show()

The idea is to iterate over the databases, collect the tables matching the pattern in each one, and union the results.

– Kate V.

@hisi's answer is elegant. However, on our cluster it fails with a GC out-of-memory error, so here is another, less elegant approach that works for me.

Let foo be the table name to search for. Then:

hadoop fs -ls -R -C /apps/hive/warehouse/ 2>/dev/null | grep '/apps/hive/warehouse/[^/]\{1,\}/foo$'

If you do not remember the exact table name but only a substring bar of it, the command is:

hadoop fs -ls -R -C /apps/hive/warehouse/ 2>/dev/null | grep '/apps/hive/warehouse/[^/]\{1,\}/[^/]\{1,\}$' | grep bar
– Oleg Svechkarenko