Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

HiveQL is the dialect of used in Apache Hive. HiveQL is similar to other dialects of SQL, but has some limitations (e.g. absence of nested subqueries and top-level disjuncts) and additional features (e.g. MapJoin), compared to most other dialects. Apache Hive is a infrastructure package built on the distributed-computing framework, written in Java.

Please read How to write good SQL question section in SQL tag wiki and follow the same rules.

References

4862 questions
19
votes
8 answers

Dropping multiple tables with same prefix in Hive

I have few tables in hive that has same prefix like below.. temp_table_name temp_table_add temp_table_area There are few hundreds of tables like this in my database along with many other tables. I want to delete tables that starts with…
Alex Raj Kaliamoorthy
  • 2,035
  • 3
  • 29
  • 46
19
votes
1 answer

Hive Select Into

I have a database people in hive. It's schema is as follows: name string, dob_date int, dob_month int, dob_year int. I have successfully loaded data from a file into the database. Now I want to have people having dob_year=1990 into a new table. The…
Nemil A Timbadia
  • 319
  • 1
  • 3
  • 6
17
votes
3 answers

Delete a database with tables in Hive

I have a database in hive which has around 100 tables. I would like to delete the whole database in a single shot query. How can we achieve that in Hive?
user7351648
17
votes
1 answer

Apache Hive How to round off to 2 decimal places?

Actually I'm looking for more details about the sum function in Apache Hive. Until now, I understood that I can specify the number of digits after the dot: val DECIMAL(18, 3) But what I can not find is the precision scale in case of sum. If I add 2…
Régis NIOX
  • 741
  • 2
  • 12
  • 30
17
votes
2 answers

Hive dynamic partitioning

I'm trying to create a partitioned table using dynamic partitioning, but i'm facing an issue. I'm running Hive 0.12 on Hortonworks Sandbox 2.0. set hive.exec.dynamic.partition=true; INSERT OVERWRITE TABLE demo_tab PARTITION (land) SELECT stadt,…
Baeumla
  • 443
  • 3
  • 6
  • 18
16
votes
3 answers

HiveQL UNION ALL

I have table_A: id var1 var2 1 a b 2 c d Table_B: id var1 var2 3 e f 4 g h All I want is table, combined: id var1 var2 1 a b 2 c d 3 e f 4 g h This is my .hql: CREATE TABLE combined AS SELECT all.id,…
dum_dum_dummy
  • 161
  • 1
  • 1
  • 3
15
votes
4 answers

Spark 2: how does it work when SparkSession enableHiveSupport() is invoked

My question is rather simple, but somehow I cannot find a clear answer by reading the documentation. I have Spark2 running on a CDH 5.10 cluster. There is also Hive and a metastore. I create a session in my Spark program as follows: SparkSession…
Anthony Arrascue
  • 220
  • 1
  • 2
  • 13
15
votes
3 answers

Hive How to select all but one column?

Suppose my table looks something like: Col1 Col2 Col3.....Col20 Col21 Now I want to select all but Col21. I want to change it to unix_timestamp() before I insert into some other table. So the trivial approach is to do something like: INSERT INTO…
Rocking chief
  • 1,039
  • 3
  • 17
  • 31
15
votes
1 answer

Concat multiple rows with a delimiter in Hive

I need to concat string values row wise with '~' as delimiter. I have the following data: I need to concat 'Comment' column for each 'id' in the ascending order of 'row_id' with '~' as delimiter. Expected output is as below: GROUP_CONCAT is not an…
Vaishak
  • 607
  • 3
  • 8
  • 30
15
votes
3 answers

Is LIMIT clause in HIVE really random?

The documentation of HIVE notes that LIMIT clause returns rows chosen at random. I have been running a SELECT table on a table with more than 800,000 records with LIMIT 1, but it always return me the same record. I'm using the Shark distribution,…
visakh
  • 2,503
  • 8
  • 29
  • 55
15
votes
6 answers

Hive error: parseexception missing EOF

I am not sure what I am doing wrong here: hive> CREATE TABLE default.testtbl(int1 INT,string1 STRING) stored as orc tblproperties ("orc.compress"="NONE") LOCATION "/user/hive/test_table"; FAILED: ParseException line…
lex
  • 1,701
  • 4
  • 19
  • 31
14
votes
3 answers

hive Expression Not In Group By Key

I create a table in HIVE. It has the following columns: id bigint, rank bigint, date string I want to get avg(rank) per month. I can use this command. It works. select a.lens_id, avg(a.rank) from tableA a group by a.lens_id, year(a.date_saved),…
chnet
  • 1,993
  • 9
  • 36
  • 51
14
votes
2 answers

Difference between 'Stored as InputFormat, OutputFormat' and 'Stored as' in Hive

Issue when executing a show create table and then executing the resulting create table statement if the table is ORC. Using show create table, you get this: STORED AS INPUTFORMAT ‘org.apache.hadoop.hive.ql.io.orc.OrcInputFormat’ OUTPUTFORMAT …
Jason
  • 173
  • 1
  • 1
  • 8
14
votes
3 answers

Both left and right aliases encountered in Hive JOIN; without any inequality clause

I am using following query: Select S.MDSE_ITEM_I, S.CO_LOC_I, MAX(S.SLS_D) as MAX_SLS_D, MIN(S.SLS_D) as MIN_SLS_D, sum(S.SLS_UNIT_Q) as SLS_UNIT_Q, MIN(PRSMN_VAL_STRT_D) as PRSMN_VAL_STRT_D, MIN(PRSMN_VAL_END_D) as…
abhiieor
  • 3,132
  • 4
  • 30
  • 47
14
votes
2 answers

HIVE - INSERT OVERWRITE vs DROP TABLE + CREATE TABLE + INSERT INTO

I'm doing some automatic script of few queries in hive and we found that we need time to time clear the data from a table and insert the new one. And we are thinking what could be faster? INSERT OVERWRITE TABLE SOME_TABLE SELECT * FROM…
Thiago Baldim
  • 7,362
  • 3
  • 29
  • 51