15

Suppose my table looks something like:

Col1 Col2 Col3.....Col20 Col21

Now I want to select all but Col21. I want to change it to unix_timestamp() before I insert into some other table. So the trivial approach is to do something like:

INSERT INTO newtable partition(Col21) 
SELECT Col1, Col2, Col3.....Col20, unix_timestamp() AS Col21
FROM oldTable

Is there a way to achieve this in Hive without listing every column by hand? Thanks a lot for your help!

Rocking chief

3 Answers

25

Try setting the property below:

set hive.support.quoted.identifiers=none;

Then select all columns except col21:

select `(col21)?+.+` from <table_name>;

For more info refer to this link.
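As a rough illustration of why the pattern works, here is a minimal sketch on a made-up table (`demo_src` and its columns are hypothetical, not from the question). Hive matches the backquoted expression against each whole column name using Java regular expressions. The possessive group `(col21)?+` swallows the name `col21` without giving it back, so the trailing `.+` has nothing left to match and that one column is skipped. Every other name matches through `.+`.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE demo_src (col1 INT, col2 STRING, col21 BIGINT);

-- Make Hive treat backquoted strings in the select list as regular expressions.
SET hive.support.quoted.identifiers=none;

-- Projects every column whose name is not exactly "col21".
SELECT `(col21)?+.+` FROM demo_src;

-- Restore the default, where backquotes simply quote identifiers.
SET hive.support.quoted.identifiers=column;
```

Resetting the property at the end is optional. It only matters if later statements in the same session rely on backquoted identifiers.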

The insert statement then becomes:

insert into <tablename> partition (col21)
select `(col21)?+.+` from ( --select all columns from the subquery except col21
select *, unix_timestamp() AS alias_col21 from table_name --select *, plus a new column that replaces col21
)a;

With this approach, alias_col21 is the last column in the select list, so it can be used as the partition column.
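Depending on the Hive version and cluster configuration, a fully dynamic partition insert like this may also need dynamic partitioning enabled for the session. The sketch below assumes the table names from the question (`newtable`, `oldTable`). Whether the first two settings are needed on your installation is an assumption to check against your own defaults.

```sql
-- May already be the default on your cluster; shown here as an assumption.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Required for the regex column specification.
SET hive.support.quoted.identifiers=none;

INSERT INTO TABLE newtable PARTITION (col21)
SELECT `(col21)?+.+`                 -- col1 .. col20, then alias_col21
FROM (
  SELECT *, unix_timestamp() AS alias_col21
  FROM oldTable
) a;
```

Hive assigns dynamic partition values by position, not by name, so it does not matter that the last column is called alias_col21 and not col21.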

In case of joins:

We cannot refer to individual columns from each table with the regex (`(t1.id)?+.+`, etc.), so drop the unnecessary columns in a subquery's select statement.

hive> insert into <tablename> partition (col21)
select * from (
  select t1.* from
    ( --drop col21 and create the new alias_col21 to replace it
      select `(col21)?+.+`, unix_timestamp() AS alias_col21 from table1
    ) t1
  join table2 t2
    on t1.<col-name>=t2.<col-name>
) a;
notNull
  • Thanks! Do you know how to do this if I am using alias? For example, SELECT table1.* except Col21 FROM table1 join table2 on some condition. I don't want to select anything in table2 though. So I need to exclude table2 and Col21. Thanks! – Rocking chief Jul 08 '18 at 02:05
  • Sure! Please see the **in case of joins** section I added to the original answer. – notNull Jul 08 '18 at 03:30
  • Can we keep the column in the same place after replacing it? For example, with col_1, col_2, col_3, I want to change col_2 and have the final output of my select be col_1, col_2, col_3 instead of col_1, col_3, col_2. – sande May 16 '19 at 15:25
  • @sande, I don't think there is a way to keep the columns in the same place, but you can define a `hive variable` holding the column list and use that **variable in your select** query. – notNull May 17 '19 at 13:19
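The variable approach from the last comment could look roughly like this. It is only a sketch: the variable name `cols` and the table name `some_table` are made up for illustration, and the column list still has to be written out once by hand.

```sql
-- Hypothetical variable holding the full column list in the desired order,
-- with col_2 replaced in place by a new expression.
SET hivevar:cols=col_1, unix_timestamp() AS col_2, col_3;

-- Hive substitutes the variable's text before parsing the query.
SELECT ${hivevar:cols} FROM some_table;
```

The same variable can be reused across several queries in the session, so the list is maintained in one place.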
3

In case you want to drop multiple columns on which you are joining:

select
    tb1.*,
    tb2.`(col1|col2)?+.+`
from
    tb1 left join tb2 on
    tb1.col1 = tb2.col1
    and tb1.col2 = tb2.col2
0

Many of us use the wrong quote character, which may be why the query is not working. Use the backtick character (`), not a single quote or any other character.

select `(name_of_col_to_be_ignored)?+.+` from table_name;

Note: Also known as an acute, backtick, left quote, or open quote, the back quote (`) is a punctuation mark. On a U.S. keyboard it shares a key with the tilde (~).