9

I have a Hive Query like

SELECT Year, Month, Day, Hours, Minutes,
           cast((cast(Seconds as int)/15) as int)*15
AS secondMod, Count(*) AS PerCount FROM LoggerTable 
 GROUP BY Year, Month, Day, Hours, Minutes, secondMod 
ORDER BY PerCount;

the above query fails with an error

FAILED: Error in semantic analysis: line 1:175 Invalid Table Alias or Column Reference secondMod

'LoggerTable' is a Hive Table with all columns of string type.

Any workaround for this issue?

Suvarna Pattayil
  • 5,136
  • 5
  • 32
  • 59
Srinivas
  • 2,479
  • 8
  • 47
  • 69

2 Answers2

12

Try this:

SELECT Year, Month, Day, Hours, Minutes, 
cast((cast(Seconds as int)/15) as int)*15 
AS secondMod, Count(*) AS PerCount FROM LoggerTable 
 GROUP BY Year, Month, Day, Hours, Minutes, 
   cast((cast(Seconds as int)/15) as int)*15
ORDER BY PerCount;
Suvarna Pattayil
  • 5,136
  • 5
  • 32
  • 59
Alexey Bychkov
  • 581
  • 5
  • 9
  • This is a huge pain if you want to define a very complicated `SELECT` and want to use it in your `WHERE`, `GROUP BY` and `ORDER BY`. There is somewhat of a workaround here http://stackoverflow.com/questions/26028767/why-cant-hive-recognize-alias-named-in-select-part – ubershmekel May 09 '16 at 21:50
4

In Hive 0.11.0 and later, columns can be specified by position if hive.groupby.orderby.position.alias is set to true. Please confirm if the following query works for you.

SET hive.groupby.orderby.position.alias=true;
SELECT Year
       ,Month
       ,Day
       ,Hours
       ,Minutes
       ,cast((cast(Seconds as int)/15) as int)*15 AS secondMod
       ,count(*) AS PerCount 
FROM LoggerTable 
GROUP BY 1, 2, 3, 4, 5, 6
ORDER BY 7;
rafaelvalle
  • 6,683
  • 3
  • 34
  • 36
  • I tried this, but it would work only for the group by. The query fails when I try order by. Do you have any thoughts why it could happen? My query looks like this: `SET hive.groupby.orderby.position.alias=true; SELECT device ,COUNT ( DISTINCT from_user ) AS users_sharing FROM . GROUP BY 1 order by 2 desc limit 10` And I got the following error: `FAILED: SemanticException [Error 10004]: Line 4:21 Invalid table alias or column reference 'from_user': (possible column names are: device, users_sharing)`
    – Dileep Kumar Patchigolla Nov 10 '16 at 11:31
  • Sorry for the format. I couldn't figure out how to post the query with the indentation preserved. – Dileep Kumar Patchigolla Nov 10 '16 at 11:33
  • Confirm that your `from_user` table alias or column reference exists. Maybe different spelling, typo. A possible solution is to break down your query into two parts: `hive.groupby.orderby.position.alias=true; SELECT res.device, res.users_sharing FROM (SELECT device, COUNT ( DISTINCT from_user ) AS users_sharing FROM . GROUP BY 1 limit 10; ) AS res ORDER BY res.users_sharing` Let us know how you solve your problem!
    – rafaelvalle Nov 10 '16 at 18:20
  • There is no typo in the col name. I tried your suggestion of separating the group by in a subquery and worked!! So there is a workaround, but I am wondering why the initial query wouldn't work with both group by and order by side-by-side. – Dileep Kumar Patchigolla Nov 11 '16 at 06:38
  • Can you try your original query with `order by 3` and let us know what the results are? – rafaelvalle Nov 11 '16 at 18:59
  • It has failed with the error saying position 3 doesn't exist: `FAILED: SemanticException [Error 10221]: Invalid position alias in Order By Position alias: 3 does not exist The Select List is indexed from 1 to 2` Is it actually possible in hive to use both group by and order by with positions in the same subquery? – Dileep Kumar Patchigolla Nov 14 '16 at 05:59
  • 1
    It seems that Hive does not allow aggregation function in ORDER BY. You could try using your original query substituting `ORDER BY 2` with `ORDER BY users_sharing` – rafaelvalle Nov 14 '16 at 06:39