Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

HiveQL is the dialect of used in Apache Hive. HiveQL is similar to other dialects of SQL, but has some limitations (e.g. absence of nested subqueries and top-level disjuncts) and additional features (e.g. MapJoin), compared to most other dialects. Apache Hive is a infrastructure package built on the distributed-computing framework, written in Java.

Please read How to write good SQL question section in SQL tag wiki and follow the same rules.

References

4862 questions
1
vote
0 answers

Hive column having complex data structure giving null values after reading the parquet data

After reading the parquet data as dataframe, it looks as shown in below image: But now I want to read the parquet data using hive. I wrote the create table statement as : CREATE EXTERNAL TABLE test_13102019 ( `DeviceInfo` STRUCT< …
Keen_Learner
  • 87
  • 1
  • 8
1
vote
2 answers

Presto - Concat multiple tables using unique identifier

I have multiple tables in the following format: table users - ID lang 1 EN 2 EN 3 DE table A - ID event1 event2 1 5 1 2 null 1 3 11 null table B - ID event1 event10 1 2 1 3 2 null so after concat/join the tables on ID…
Andreyn
  • 304
  • 5
  • 14
1
vote
1 answer

How to go from long to wide structure in HIVE?

I have a hive data that looks like this data in long format I'd like to restructure it to look like this data in wide format What I have looks like this my latest attempt This is the code that I've used. I'd like to get rid of all the NULL's and…
CeC
  • 85
  • 10
1
vote
1 answer

how to get to_epoch(sysdate-90) in Hive

I have a query which runs good in Oracle, but i want to use the same query in Hive. query: select count(mem_id) from mem where cobrand_id = '10001372' and user_type_id =1 and status_ind <>3 and LAST_ACCESSED >= to_epoch(sysdate-90); Also, I…
Somanath Patil
  • 35
  • 2
  • 10
1
vote
1 answer

How to generate a random date from a given range with hive

use hive . I need to generate random dates selected from a given date range, between '2019-01-01' and '2019-10-31'. Can any one guide me with my query?
evan
  • 15
  • 3
1
vote
1 answer

Hive compactions not triggered automatically - HDP_2.6.5

I created a table in hive to test compaction, CREATE TABLE part_test(id int, name string, city string) PARTITIONED BY (dept string) clustered by (city) into 5 buckets stored as orc TBLPROPERTIES('transactional'='true'); In the screenshot we can…
1
vote
2 answers

how to extract a json object from list of json based on key in apache hive

I need to extract a json from below list of json based on id field value and display the key as columns…
Imran
  • 121
  • 4
  • 9
1
vote
2 answers

Hive Combine Multiple Map Values into One column

Table (Input data) +--------+---------+ | col_id | col_val | +--------+---------+ | axc | col_x | | bdf | col_f | | cde | col_x | | yhc | col_f | | idx | col_a | | dft | col_b | +--------+---------+ Tranpose logic.…
sqllearn
  • 27
  • 5
1
vote
1 answer

Subtract 3 months to yyyymm month format column hive sql

In HiveSql I have a yearmonth [yyyymm] column from which I need to subtract 3 months For example: if yearmonth is 201912 , the record required is 201909 Can someone please help me with the syntax or script I need to get for this ? I have tried…
1
vote
1 answer

Clarification on sortby vs order by in hive

I am going through below Hive manual and confused by the details explained on documentation https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy First it says Hive uses the columns in SORT BY to sort the rows before feeding the …
user2895589
  • 1,010
  • 4
  • 20
  • 33
1
vote
1 answer

Picking minValue and its row in hive

I have to pick the minValue over a sliding date window of 2 hours and its corresponding date value. For example Create table stock(time string, cost float); Insert into stock values("1990-01-01 8:00 AM",4.5); Insert into stock values("1990-01-01…
Raj
  • 401
  • 6
  • 20
1
vote
2 answers

sql regexp string end with ".0"

I want to judge if a positive number string is end with ".0", so I wrote the following sql: select '12310' REGEXP '^[0-9]*\.0$'. The result is true however. I wonder why I got the result, since I use "\" before "." to escape. So I write another one…
Qing Guo
  • 33
  • 5
1
vote
1 answer

Hive lag window partition

Here is my table : sensor_name, ext_value, int_value, growth 47ACXVMACSENS01, 238, 157, 1 47ACXVMACSENS01, 157, 256, 2 47ACXVMACSENS01, 895, 345, 3 47ACXVMACSENS01, 79, 861, 3 91DKCVMACSENS02, 904, 858, 1 91DKCVMACSENS02, 925, 588,…
Talia9
  • 13
  • 2
1
vote
1 answer

Hive: Reduce millisecond precision in timestamp

In Hive, is there anyway to reduce millisecond precision (not rounding)? For example I have the following timestamp with millisecond in 3 decimal places 2019-10-08 21:21:39.163 I want to get a timestamp exactly in 1 decimal place (remove two last…
Long Le
  • 404
  • 5
  • 18
1
vote
4 answers

How to make hive return rows faster? (say, limit 10)

I noticed that when MR is involved, even for simply queries like: select * from table where condition_a and condition_b limit 10; Hive will scan all the data (sure), but it will only return when all the MR jobs finish, even we only need 10…
kcode2019
  • 119
  • 1
  • 7
1 2 3
99
100