Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

HiveQL is the dialect of used in Apache Hive. HiveQL is similar to other dialects of SQL, but has some limitations (e.g. absence of nested subqueries and top-level disjuncts) and additional features (e.g. MapJoin), compared to most other dialects. Apache Hive is a infrastructure package built on the distributed-computing framework, written in Java.

Please read How to write good SQL question section in SQL tag wiki and follow the same rules.

References

4862 questions
1
vote
1 answer

Aggregating data where conditions are stored in another table

I have two tables. I for each date and category pair in the table2 I want to calculate the past week and two-week record counts for that category calculated from that day. table1 | DATE |Category | |--------------|---------| |2018-10-01 …
Coderzz
  • 55
  • 7
1
vote
2 answers

why is delete function not working in hive shell?

hive> delete from daily_case where num_casedaily=0; FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations. thank you in advance.
1
vote
1 answer

Concat and then group by in Hive

I have 3 columns in a table as given below: |---------------------|------------------|-------------| | dept | class | item | |---------------------|------------------|-------------| | 234 | 34…
Reema
  • 33
  • 5
1
vote
1 answer

indexOf alike function in HIVEQL

I have the following query: SELECT col1, case when array_contains(col1, "c") then "c exists" end as col2 FROM ( SELECT * FROM ( SELECT array("a","b","c") AS col1 ) q1 ) q2; I want…
agustin
  • 1,311
  • 20
  • 42
1
vote
1 answer

After changing column name in hive, value of column are getting NULL

Working on hive table, where I need to change column name as below, its working as expected and changing column name but underline value of this column getting NULL. ALTER TABLE db.tbl CHANGE hdfs_loaddate hdfs_load_date String; Here changed column…
Ajay Kharade
  • 1,469
  • 1
  • 17
  • 31
1
vote
1 answer

Hive Delimiter using :

I want to extract a column A that has values such as W:X:Y:Z. I am interested to extract Z from Column A. I tried multiple commands such as SPLIT(Table.A, "[:]"[3] ) but get an error. What is the best way to do this?
pp2000
  • 35
  • 1
  • 2
  • 6
1
vote
1 answer

If a value is not found in a Hive table/ DataFrame, then get the next largest value

I have two Hive tables/ Spark Dataframes A and B A --------+----------+------+ product | date | id | --------+----------+------+ A | 20200201 | X | --------+----------+------+ B | 20200301 | Y | …
Amber
  • 914
  • 6
  • 20
  • 51
1
vote
1 answer

HIVE - insert result of a select statement as multiple records into a hive table without overwriting existing content

I have a table from following command: CREATE TABLE treatment_costs AS SELECT * FROM (SELECT r.patient_id, r.transaction_date, r.paid_transaction_amount, o.dob, o.department_name, o.reason_of_visit FROM ReceiptTransactions r LEFT OUTER JOIN…
Naveen Reddy Marthala
  • 2,622
  • 4
  • 35
  • 67
1
vote
1 answer

Count the number of transactions per month for an individual group by date Hive

I have a table of customer transactions where each item purchased by a customer is stored as one row. So, for a single transaction there can be multiple rows in the table. I have another col called visit_date. There is a category column called…
krishna koti
  • 659
  • 1
  • 6
  • 10
1
vote
1 answer

Hive tablesampling and bucketing

I'm new to Hive and facing some problem. I'm learning bucketing right now and my task is to create a Hive table that consists of 2 buckets, then put at least 5 records into that table. Well, that part is clear I think: CREATE TABLE (id…
ArnAuth
  • 13
  • 2
1
vote
1 answer

Parse and extract fields from an url field in HiveQL

I have this data from a…
Chema
  • 2,748
  • 2
  • 13
  • 24
1
vote
0 answers

Hive : Filter array of maps

I have a hive table column as col1 array> this column contains values as [ {'a':1,'b':2}, {'a':1,'c':3}, . . . ] now the keys in the inner map can vary per item. I want to filter the array to…
AbtPst
  • 7,778
  • 17
  • 91
  • 172
1
vote
1 answer

Increase the performance of the insert overwrite in hive managed table

I'm new to Hive and I wanted to know the list of table properties to increase the performance of the insert overwrite in hive managed table. Can someone help with that?
Shadab Hussain
  • 794
  • 6
  • 24
1
vote
1 answer

Hive: how to eliminate the duplicated substrings

Hive table: create table T (i int, s string); insert into T values (1, "a1&b2"), (1, "b2&c3"), (2, "c1&d2"), (2, "c1"); The s column contains the values separated by & Desired output should group by 1st column and concatenate the s…
user510040
  • 159
  • 2
  • 10
1
vote
1 answer

Find rows by latest date and time of distinct few column values in a large data set

I have date in below format. I want to find the whether the latest snacks eaten by serial number 123 and 101 were veg, non veg or vegan. I tried : SELECT serial_number,Name,event_ts_gmt FROM TT T1 INNER JOIN ( SELECT …