Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

HiveQL is the dialect of used in Apache Hive. HiveQL is similar to other dialects of SQL, but has some limitations (e.g. absence of nested subqueries and top-level disjuncts) and additional features (e.g. MapJoin), compared to most other dialects. Apache Hive is a infrastructure package built on the distributed-computing framework, written in Java.

Please read How to write good SQL question section in SQL tag wiki and follow the same rules.

References

4862 questions
9
votes
1 answer

Hive Map-Join configuration mystery

Could someone clearly explain what is the difference between hive.auto.convert.join and hive.auto.convert.join.noconditionaltask configuration parameters? Also these corresponding size parameters: hive.mapjoin.smalltable.filesize and…
leftjoin
  • 36,950
  • 8
  • 57
  • 116
9
votes
2 answers

How to extract selected values from json string in Hive

I am running a simple query in Hive that produces the following output (with a few other additional columns. |------|-----------------------------------------------------------| | col1 | col2 …
jimiclapton
  • 775
  • 3
  • 14
  • 42
9
votes
3 answers

SQL query Frequency Distribution matrix for product

i want to create a frequency distribution matrix 1.Create a matrix.**Is it possible to get this in separate columns** customer1 p1 p2 p3 customer 2 p2 p3 customer 3 p2 p3 p1 customer 4 …
coder25
  • 2,363
  • 12
  • 57
  • 104
9
votes
1 answer

SemanticException [Error 10265] while running simple hive select query on a transactional hive table

I created the table in hive: CREATE TABLE test_table (COL1 string, COL2 string, COL3 string, COL4 string) CLUSTERED BY(COL2) INTO 4 BUCKETS STORED AS ORC tblproperties("transactional"="true"); now trying to query using putty in hive prompt: select…
osum
  • 805
  • 2
  • 10
  • 32
9
votes
1 answer

Is there a way to transpose data in Hive

This is my table: pid high medium low 1 10 8 6 2 20 16 12 3 10 6 4 I want store this data in another table in Hive with the following format: pid priority value 1 …
priyanka
  • 287
  • 2
  • 5
  • 11
9
votes
6 answers

Hive query stuck at 99%

I am inserting records using left joining in Hive.When I set limit 1 query works but for all records query get stuck at 99% reduce job. Below query works Insert overwrite table tablename select a.id , b.name from a left join b on a.id = b.id…
user2895589
  • 1,010
  • 4
  • 20
  • 33
9
votes
2 answers

Search for a particular text in a string - Hive

/google/gmail/inbox /google/drive/map /google/apps /yahoo/news/cricket /yahoo/mail/ /yahoo/sports /wiki/ind/jack /wiki/us/jil I need to get the required page groups. If i search for page group starting with 'google' using hive query, I need…
Krishna
  • 153
  • 1
  • 5
  • 12
9
votes
3 answers

Can you explain when and why mapreduce is invoked in hive

select * from Table_name limit 5; select col1_name,col2_name from table_name limit 5; When i run the first query there will be no MapReduce invoked, while for other the MapReduce is invoked. Could you please explain the reason.
Saj
  • 53
  • 1
  • 1
  • 6
9
votes
1 answer

Hive Union Group By Error

In the following query, I am trying to count up the distinct, and total occurences of the column "primary" and and summarise this for each column. I have two source tables, which contain similar information. I want to union these to pull all the…
Tminer
  • 302
  • 2
  • 4
  • 14
9
votes
1 answer

Array literal in Hive

How do I write out an array literal in Hive? SELECT PERCENTILE(my_column, [0.5, 0.25, 0.50, 0.75, 0.95]) AS quantiles FROM my_table Returns the error FAILED: ParseException line xx:xx cannot recognize input near '[' '0.5' ',' in select expression
John McDonnell
  • 1,470
  • 2
  • 18
  • 19
9
votes
4 answers

Convert string to timestamp in Hive

I have the following string representation of a timestamp in my Hive table: 20130502081559999 I need to convert it to a string like so: 2013-05-02 08:15:59 I have tried following ({code} >>> {result}):…
ferics2
  • 5,241
  • 7
  • 30
  • 46
9
votes
1 answer

Grouping hive rows in an array of this rows

I have a table like the following : User:String Alias:String JohnDoe John JohnDoe JDoe Roger Roger And I would like to group all the aliases of an user in an array, in a new table which would look like this : User:String…
C4stor
  • 8,355
  • 6
  • 29
  • 47
9
votes
3 answers

Combine many tables in Hive using UNION ALL?

I'm trying to append one variable from several tables together (aka row-bind, concatenate) to make one longer table with a single column in Hive. I think this is possible using UNION ALL based on this question ( HiveQL UNION ALL ), but I'm not sure…
baha-kev
  • 3,029
  • 9
  • 33
  • 31
8
votes
3 answers

When to choose rank() over dense_rank() or row_number()

Since we can get the row number assigned using the row_number() and if we want to find the rank of each row without skipping of any number within the partition using dense_rank(), why do we need rank() function, I can't think of any use case that…
Ankur
  • 893
  • 3
  • 12
  • 29
8
votes
1 answer

Hive- how do I "create table as select.." with partitions from original table?

I need to create a "work table" from our hive dlk. While I can use: create table my_table as select * from dlk.big_table just fine, I have problem with carrying over partitions (attributes day, month and year) from original "big_table" or just…
MageInTraining
  • 83
  • 1
  • 1
  • 4