0

I want to fetch all databases and its related tables and columns information. Obviously I can do it from metastore. But I don't have access to it . So is there any other way instead of querying each database one by one.

Vivz
  • 6,625
  • 2
  • 17
  • 33
Mayank Pande
  • 150
  • 7

2 Answers2

1

you'd need python but i did it with this:

databases = run_hive_query('show schemas')
databases = list(databases.database_name)

schema = {'DB':[],
      'Table':[],
      'Column':[], 
      'DataType':[]}
for db in databases:
tables = run_hive_query( 'show tables from ' +db)
tables = list(tables.tab_name)
for tb in tables:
    try:
        columns = (run_hive_query('desc ' + db+'.'+tb))
        print(db + '  '+ tb)
    except:
        print('failed'+db + '  '+ tb)
    try:
        for x in range(columns.shape[0]):
            schema['DB'].append(db) 
            schema['Table'].append(tb)
            schema['Column'].append(columns.iloc[x][0])
            schema['DataType'].append(columns.iloc[x][1])
    except:
        print('failed'+db + '  '+ tb)    
0

You should be able to run the following commands. I guess you could script it up to run across all databases and all tables

SHOW DATABASES;
SHOW TABLES;
DESCRIBE <table_name>;
Dave Whittingham
  • 338
  • 3
  • 11