Questions tagged [hive-query]

82 questions
1
vote
1 answer

Insert data in hive using multidelimeter

How do I insert data into Hive using multiple delimiters when the delimiter between the columns is not specified? Below is my data: 25380 20130101 2.514 -135.69 58.43 8.3 1.1 4.7 4.9 5.6 0.01 C 1.0 -0.1 0.4 97.3 …
shael
  • 177
  • 9
1
vote
2 answers

Populate preceding value using case statement in Hive

I have a column event in a Hive table like below. Event Sent Sent Open Open Click Sent Open Signup Sent Open Click Now I want to create a new column based on the values in the event column using a case statement. I want to where there is signup in event…
nmr
  • 605
  • 6
  • 20
1
vote
0 answers

Pivot rows to columns using hive query dynamically

I have implemented the following query to pivot col_nm from row to column: select grp_id,orig_businesseffectivedate, max(case when col_nm= 'bus_seg_cd' then col_val end) as bus_seg_cd, max(case when col_nm= 'dft_insur_cd' then col_val end) as…
nimcurry
  • 11
  • 1
  • 4
1
vote
2 answers

Create range bins in hive for histograms

I have a data set which contains students_id and their ages. I want the marks to be arranged in ranges or bins with a bucket size of 10. stud_id ages 101 11 102 13 103 21 104 25 Similarly I have date for more…
Sara
  • 312
  • 6
  • 15
1
vote
1 answer

Run query in beeline from file

I want to run a query stored in a file in beeline. This code works OK in PuTTY. beeline -u "hiveserver" -n "username" -p "password" --outputformat=csv2 --silent=true -e "select * from table;" >output1.txt When I save the SQL command to query.hql or query.sql…
ALdo
  • 75
  • 2
  • 13
1
vote
2 answers

Calculating consecutive range of dates with a value in Hive

I want to know if it is possible to calculate the consecutive ranges of a specific value for a group of Id's and return the calculated value(s) of each one. Given the following data: +----+----------+--------+ | ID | DATE_KEY | CREDIT…
1
vote
0 answers

Hive complex types

Please help me understand the difference between collect_set(named_struct) and array(named_struct) when inserting data into a column of type array<struct> in a table. Is there any difference between the two options?
kmreddy
  • 11
  • 2
1
vote
1 answer

Will data get deleted on dropping internal table using location clause during its creation from hive?

In Hive, if I create an internal table using the location clause (specifying a location other than Hive's default location) in the table creation statement, then on dropping that table will it delete the data from the specified location just like it does…
1
vote
0 answers

Hive Add partition to external table slow

So I need to create an external table for some data stored on S3 and add partitions explicitly (unfortunately, the directory hierarchy does not fit the dynamic partition functionality due to the name mismatch) for example: add partition for…
seiya
  • 1,477
  • 3
  • 17
  • 26
1
vote
1 answer

Select statement in hive return some columns with null value

I have seen this type of question asked many times, but those solutions did not work for me. I created an external Hive table, since my data is the output of a map-only job. Then, using the load command, I gave the path to the specific file. It…
0
votes
0 answers

Hive sql left join not bringing back all rows from left table when there is a where clause

I'm trying to retrieve all rows from the left table using a left join, but only matched records are returned when I add a where clause. When there is no where clause, all rows from the items table are returned, with the following script: select * from…
user15676
  • 123
  • 2
  • 10
0
votes
0 answers

How can I kill the query by looking query explain plan and costs or can i look memory usage to kill query?

I use Hive for SQL to get data from Hadoop. How can I decide to kill a query by looking at its explain plan and costs, or can I look at memory usage to decide whether to kill it? I use this to look at the query: explain select * from default.my_table where my_query like…
CompEng
  • 7,161
  • 16
  • 68
  • 122
0
votes
1 answer

Apache Hive command

I have this question: Show the top 5 game Disciplines for the countries who got more than 10 gold medals. My code is: select distinct t.discipline, m.team from teams t join medals m on (t.noc=m.team and m.numbergold>10) order by m.team; cloud…
head
  • 1
  • 1
0
votes
1 answer

Is there a Hiveql function using which we can pull records from a table where a JSON type column has a specific value for a key?

I'm looking to get a count of records in which a column (type) of JSON type has a certain key:value pair, in a table named product_type. _______________________________________________________ id | product | type | 1 |…
zeva_u
  • 1
  • 2
0
votes
0 answers

HIVE - How to drop columns with more than 50% missing values in hive

I have a dataset with the columns maker, model, mileage, manufacture_year, engine_displacement, engine_power, body_type, color_slug, skt_year, transmission, door_count, seat_count, fuel_type, date_created, date_last_seen, price_eur. And I need to drop the…
Wfee
  • 71
  • 1
  • 6