size of array with conditional argument in hive

Question

I have a dataset that has a column with an array of timestamps and a column with just one timestamp. I'm looking to get the size of the array using the c1 timestamp as a condition for larger and smaller.

Table (my_table):

c1 |           c2           |
----------------------------|
4  | [1,2,3,4,5,6,7,8,9,10] |
1  | [1,2,3,4,5,6,7,8,9,10] |
5  | [1,2,3,4,5,6,7,8,9,10] |
3  | [1,2,3,4,5,6,7,8,9,10] |

Query:

select
c1,
c2,
size(some_udf_split_on_c1(sort_array(<array>), c1)[1]) AS smaller_than_c1
size(some_udf_split_on_c1(sort_array(<array>), c1)[2]) AS larger_than_c1

from my_table

The udf's there are my hypothesized implementation.

Output:

c1 |           c2           | smaller_than_c1 | larger_than_c1
----------------------------|-----------------|---------------
4  | [1,2,3,4,5,6,7,8,9,10] |        3        |      6
1  | [1,2,3,4,5,6,7,8,9,10] |        0        |      9
5  | [1,2,3,4,5,6,7,8,9,10] |        4        |      5
3  | [1,2,3,4,5,6,7,8,9,10] |        1        |      8

I have no code for that functionality I'm looking for help. Do you really need me to write the simple select statement? — moku, Jul 19 '15 at 15:30
I was asking in a very traditional stack overflow way how one might go about accomplishing that task. If it requires a custom udf all one would have to do is say "that would require coding a custom udf". If there are existing udfs that could accomplish the task simply pointing me to those udfs would be helpful. Isn't stack overflow about helping others by sharing information and expertise on a subject. @GoBrewers14 not sure how your contributing constructively atleast GaneshThiagarajan asked for something to help improve the post. — moku, Jul 20 '15 at 04:10

score 0 · Answer 1 · answered Jul 21 '15 at 18:48

0

A UDF will be the quickest path to what you're looking for. It would probably be possible to do this with a combination of explode and group bys, but compared to 10 lines of UDF code, you'll be happier just writing the UDF.

answered Jul 21 '15 at 18:48

Jakob Homan

2,284
1
13
16

size of array with conditional argument in hive

1 Answers1