Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

HiveQL is the dialect of used in Apache Hive. HiveQL is similar to other dialects of SQL, but has some limitations (e.g. absence of nested subqueries and top-level disjuncts) and additional features (e.g. MapJoin), compared to most other dialects. Apache Hive is a infrastructure package built on the distributed-computing framework, written in Java.

Please read How to write good SQL question section in SQL tag wiki and follow the same rules.

References

4862 questions
11
votes
1 answer

What is the difference between 'InputFormat, OutputFormat' & 'Stored as' in Hive?

Im new to Bigdata and currently learning Hive. I understood the concept of InputFormat & OutputFormat in Hive as part of SerDe. I also understood that 'Stored as' is used to store a file in a particular format just like InputFormat. But I don't…
Metadata
  • 2,127
  • 9
  • 56
  • 127
11
votes
1 answer

How does the percentile function work in Hive?

Imagine the following column calledd id: 68 69 43 54 56 61 69 70 71 72 77 78 79 85 87 88 89 93 95 96 98 99 99 62 66 If I do the following: percentile(id, 0.9), the output is 97.2. What is going on?
Pratik Garg
  • 747
  • 2
  • 9
  • 21
11
votes
2 answers

HIVE insert overwrite directory with json format

How can I insert overwrite directory with json schema? There is raw hive avro table; (this is actually has many fields) tb_test-------- name string kickname string ----------------- then I want to save query result into some directory in hdfs by…
Jihun No
  • 1,201
  • 1
  • 14
  • 29
11
votes
2 answers

Hive's unix_timestamp and from_unixtime functions

I am under the impression that unix_timestamp and from_unixtime Hive functions are 'reverse' of each other. When I try to convert timestamp string to seconds in Hive: SELECT unix_timestamp('10-Jun-15 10.00.00.000000 AM', 'dd-MMM-yy hh.mm.ss.MS…
oikonomiyaki
  • 7,691
  • 15
  • 62
  • 101
11
votes
1 answer

Hive UDF for selecting all except some columns

The common query building pattern in HiveQL (and SQL in general) is to either select all columns (SELECT *) or an explicitly-specified set of columns (SELECT A, B, C). SQL has no built-in mechanism for selecting all but a specified set of columns.…
Sim
  • 13,147
  • 9
  • 66
  • 95
11
votes
3 answers

How to see the date when the table was created?

I have created a table couple months ago. Is there any way in HIVE that I can see when was the table created? show table doesn't give the date creation of the table.
sharp
  • 2,140
  • 9
  • 43
  • 80
11
votes
1 answer

Hive: Is it possible to rename an existing hive database?

There seems to be a similar requirement at: https://issues.apache.org/jira/browse/HIVE-4847 However, when I tried the proposed syntax from the issue's Description - 'ALTER DATABASE dbname RENAME TO newdbname', it gave me the following error: ALTER…
activelearner
  • 7,055
  • 20
  • 53
  • 94
11
votes
1 answer

Error in Hive Query while joining tables

I am unable to pass the equality check using the below HIVE query. I have 3 table and i want to join these table. I trying as below, but get error : FAILED: Error in semantic analysis: Line 3:40 Both left and right aliases encountered in JOIN…
Agustus
  • 634
  • 1
  • 7
  • 24
11
votes
3 answers

Hive Query Execution Error, return code 3 from MapredLocalTask

I am getting this error while performing a simple join between two tables. I run this query in Hive command line. I am naming table as a & b. Table a is Hive internal table and b is External table (in Cassandra). Table a has only 1610 rows and Table…
user3517633
  • 111
  • 1
  • 1
  • 4
11
votes
3 answers

Offset functionality in Hive

How can I achieve the same functionality as SQL's "offset" in Hive? SELECT * from table LIMIT 20 OFFSET 30 Thanks!
Jieren
  • 1,952
  • 4
  • 18
  • 26
11
votes
2 answers

Hive - external (dynamically) partitioned table

I'm having a table in MySQL viz. nas_comps. select comp_code, count(leg_id) from nas_comps_01012011_31012011 n group by comp_code; comp_code count(leg_id) 'J' 20640 'Y' 39680 First, I imported data onto HDFSHadoop version…
Kaliyug Antagonist
  • 3,512
  • 9
  • 51
  • 103
11
votes
4 answers

Hive doesn't support in, exists. How do I write the following query?

I have two tables A and B that both have a column id. I wish to obtain ids from A that are not present in B. The obvious way is: SELECT id FROM A WHERE id NOT IN (SELECT id FROM B) Unfortunately, Hive doesn't support in, exists or subqueries. Is…
elexhobby
  • 2,588
  • 5
  • 24
  • 33
11
votes
1 answer

Map type variable in hive

I am having trouble trying to define map type in hive. According to Hive Manual there definitely is a map type, unfortunately there aren't any examples on how to use it. :-( Suppose, I have a table (users) with following columns: Name Ph …
test123
  • 13,865
  • 9
  • 28
  • 33
11
votes
5 answers

Date comparison in Hive

I'm working with Hive and I have a table structured as follows: CREATE TABLE t1 ( id INT, created TIMESTAMP, some_value BIGINT ); I need to find every row in t1 that is less than 180 days old. The following query yields no rows even though…
anon
10
votes
1 answer

TIMESTAMP format issue in HIVE

I have Hive table created from JSON file. CREATE external TABLE logan_test.t1 ( name string, start_time timestamp ) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' WITH SERDEPROPERTIES ( "timestamp.formats" =…
logan
  • 7,946
  • 36
  • 114
  • 185