My Hive tables are stored in Parquet format at a location in HDFS. Can I convert the Parquet files at this location to sequence file format and build Hive tables over them? Is there a procedure for this conversion?
Why? – David דודו Markovitz Mar 27 '17 at 17:47
@DuduMarkovitz Another team in my company wants the data in sequence file format. – Andy Reddy Mar 27 '17 at 18:42
2 Answers
1
Create a new table stored as a sequence file and reload the data into it using insert select:
insert into sequence_table
select * from parquet_table;
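
For completeness, here is a minimal sketch of the whole sequence for an unpartitioned table. The table names follow the answer; the single column `i int` is only a placeholder assumption, so substitute the real column list of the Parquet table.

```sql
-- Assumption: parquet_table has a single column i int (placeholder schema).
-- The target table has the same columns but is stored as a sequence file.
create table sequence_table (i int)
stored as sequencefile;

-- Rewrite the rows from the Parquet table into the sequence file table.
insert into table sequence_table
select * from parquet_table;

-- Optional check: the output lists the table's InputFormat and OutputFormat.
describe formatted sequence_table;
```

Note that `create table ... like parquet_table` would also copy the Parquet storage format, which is why the sketch spells out the columns and the `stored as sequencefile` clause explicitly.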

leftjoin
If my sequence table is partitioned by year, month, day, how can I insert all the records from my Parquet table, which is also partitioned by year, month, day, into my sequence table as they are? – Andy Reddy Mar 29 '17 at 04:14
Create a partitioned table, then run `insert overwrite table sequence_table partition (year, month, day) select from parquet table`. The partition keys should come last in the select list. Add distribute by on the partition keys at the end to reduce pressure on the reducers. If the target table has exactly the same structure, you can use select *. – leftjoin Mar 29 '17 at 06:54
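
The steps in the comment above can be sketched as follows. This is an illustration under assumptions, not the commenter's exact statement: the data column `i int` and the partition column types are placeholders, and dynamic partitioning is assumed to need enabling in the session.

```sql
-- Assumption: one data column i int; partition column types are placeholders.
create table sequence_table (i int)
partitioned by (year int, month tinyint, day tinyint)
stored as sequencefile;

-- Allow every partition value to be taken from the query result.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Partition keys go last in the select list, in the order of the partition clause.
-- distribute by routes each partition's rows to the same reducer,
-- so each reducer writes to fewer partitions.
insert overwrite table sequence_table partition (year, month, day)
select i, year, month, day
from parquet_table
distribute by year, month, day;
```

Listing the columns explicitly rather than using `select *` keeps the statement correct even if the two tables do not declare their columns in the same order.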
1
hive> create table src (i int) stored as parquet;
OK
Time taken: 0.427 seconds
hive> create table trg stored as sequencefile as select * from src;
For @AndyReddy, the same approach with partitioned tables:
create table src (i int)
partitioned by (year int,month tinyint,day tinyint)
stored as parquet
;
create table trg (i int)
partitioned by (year int,month tinyint,day tinyint)
stored as sequencefile
;
set hive.exec.dynamic.partition.mode=nonstrict
;
insert into trg partition(year,month,day)
select * from src
;

David דודו Markovitz
If my sequence table is partitioned by year, month, day, how can I insert all the records from my Parquet table, which is also partitioned by year, month, day, into my sequence table as they are? Just do insert into? – Andy Reddy Mar 29 '17 at 04:35