30

I have a hive table stored as a sequencefile.

I need to load a text file into this table. How do I load the data into this table?

WestCoastProjects
  • 58,982
  • 91
  • 316
  • 560
cldo
  • 1,735
  • 6
  • 21
  • 26

3 Answers3

55

You can load the text file into a textfile Hive table and then insert the data from this table into your sequencefile.

Start with a tab delimited file:

% cat /tmp/input.txt
a       b
a2      b2

create a sequence file

hive> create table test_sq(k string, v string) stored as sequencefile;

try to load; as expected, this will fail:

hive> load data local inpath '/tmp/input.txt' into table test_sq;

But with this table:

hive> create table test_t(k string, v string) row format delimited fields terminated by '\t' stored as textfile;

The load works just fine:

hive> load data local inpath '/tmp/input.txt' into table test_t;
OK
hive> select * from test_t;
OK
a       b
a2      b2

Now load into the sequence table from the text table:

insert into table test_sq select * from test_t;

Can also do load/insert with overwrite to replace all.

libjack
  • 6,403
  • 2
  • 28
  • 36
  • 2
    could that be done directly, can we load to sequence format table stuff from TSV file without intermediate saving to other table? – Bohdan Dec 07 '13 at 00:07
  • No it can't be done. The only easiest way to load data into sequence file is the above way. The same is reaffirmed by apche wiki https://cwiki.apache.org/confluence/display/Hive/CompressedStorage – Jai Prakash Sep 16 '14 at 04:24
2

You cannot directly create a table stored as a sequence file and insert text into it. You must do this:

  1. Create a table stored as text
  2. Insert the text file into the text table
  3. Do a CTAS to create the table stored as a sequence file.
  4. Drop the text table if desired

Example:

CREATE TABLE test_txt(field1 int, field2 string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/path/to/file.tsv' INTO TABLE test_txt;

CREATE TABLE test STORED AS SEQUENCEFILE
AS SELECT * FROM test_txt;

DROP TABLE test_txt;
Aidan Feldman
  • 5,205
  • 36
  • 46
0

The simple way is to create table as textfile and move the file to the appropriate location

CREATE EXTERNAL TABLE mytable(col1 string, col2 string)
row format delimited fields terminated by '|' stored as textfile;

Copy the file to the HDFS Location where table is created.
Hope this helps!!!

MukeshKoshyM
  • 514
  • 1
  • 8
  • 16