Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

HiveQL is the dialect of SQL used in Apache Hive. HiveQL is similar to other dialects of SQL, but compared to most of them it has some limitations (e.g. absence of nested subqueries and top-level disjuncts) and some additional features (e.g. MapJoin). Apache Hive is a data warehouse infrastructure package, written in Java, built on the Hadoop distributed-computing framework.

Please read the "How to write good SQL question" section in the SQL tag wiki and follow the same rules.

4862 questions
1
vote
1 answer

How can we truncate text after space in Hadoop?

I have a column, say column_1, and its values are: "abc 12edf", "hbnm 847", "47sf hg41". I need the output as follows: abc, hbnm, 47sf. PS: I have read-only access to the db.
1
vote
1 answer

handle special characters in hive using regex

I have data which comes up like this. …
Irthiza Khan
  • 23
  • 1
  • 7
1
vote
0 answers

Handling Nulls while calculating percentiles in Hive

I am having some trouble handling nulls while calculating percentiles. Below is the sample data. Code that I am using now: percentile(column_1, array(0, 0.25, 0.50, 0.75, 1)) as column_1_p Here it considers null values…
kumar
  • 33
  • 6
1
vote
1 answer

hive - Is it possible to create columns from Map

I am seeking some advice on whether a Map in Hive can be converted to columns (transposed). Some sample records, and how I would like to represent them, are given…
linux developer
  • 821
  • 1
  • 13
  • 34
1
vote
0 answers

Performance of joining SCD Type 2 tables in Hive

I have 2 tables in Hive which are managed using SCD Type 2 (https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row). Basically, each record has 2 columns, valid_start_date and valid_end_date, to specify the period during which…
1
vote
1 answer

How to count records from multiple columns eliminating null values in hive table

I'm using the command below to find the sum of records from 8 columns, but I am getting null in the output. How can this be fixed?
Inban
  • 33
  • 6
1
vote
2 answers

How do I concatenate fragmented messages of strings that are out of order in SQL

I have a table with three columns. The first column indicates a message ID (message_id), the second column is an ordinal feature which indicates the order of the message (message_order), and the third column is a fragment of the…
Jose
  • 31
  • 5
1
vote
1 answer

Hive : Expand a single record into multiple records when column values are separated by delimiter (~)

I have the following information:

id   user         date                                    test
102  123~456~897  01JAN2019~6JUL2018~21DEC2017            abc~qwer~~wer~
103  13~45~88~34  01JAN2020~6JUN2019~21JAN2018~12MAR2017…
shanmukh
  • 57
  • 5
1
vote
1 answer

Hive SQL aggregate merge multiple sqls into one

I have a series of SQL queries like: select count(distinct userId) from table where hour >= 0 and hour <= 0; select count(distinct userId) from table where hour >= 0 and hour <= 1; select count(distinct userId) from table where hour >= 0 and hour <=…
Calix
  • 11
  • 1
1
vote
0 answers

Is there any Source Code Analysis/Quality tool for HiveQL?

I googled a lot to find a source code analyzer for HiveQL but didn't find any such tool. I have checked all the plugins in SonarQube, but there is no plugin for HiveQL. Is there any Source Code Analysis/Quality tool for HiveQL?
Ash
  • 33
  • 1
  • 6
1
vote
1 answer

Convert base64 to readable string

I have a table with an nvarchar column in SQL Server 2016 that I want to store in Hive. The nvarchar column can have non-ASCII characters. The data from SQL Server is extracted to a file with the nvarchar column converted to a base64-encoded string. I…
pawinder gupta
  • 1,225
  • 16
  • 35
1
vote
1 answer

Find repeated users from month to month

I have data like the below.

First-month transaction

User_id  trsaction_completed_date  user_type
1234     7-Jan-19                  New
5657     8-Jan-19                  New
7890     9-Jan-19                  …
sana
  • 11
  • 2
1
vote
2 answers

How to convert "2019-11-02T20:18:00Z" to timestamp in HQL?

I have the datetime string "2019-11-02T20:18:00Z". How can I convert it into a timestamp in Hive HQL?
linrongbin
  • 2,967
  • 6
  • 31
  • 59
1
vote
2 answers

combine multiple group by queries into single query

I have a table similar in structure to the one below. I have four grouping variables (Region, Site, Method, Client) and three metrics that each have an accompanying "bucket" column that gives the quintile of the metric. I would like to perform some…
hghg hghg
  • 127
  • 9
1
vote
1 answer

Is there a way to give a "second name" to a table in Hive so that a user can refer to either name of the table and would retrieve the same thing?

I would like to be able to refer to tables with a certain naming scheme to make my code uniform, but I am pulling tables from different environments with different naming schemes. If I want all my tables to have names like example_table_1 and…
bw1997
  • 37
  • 3