
I want to copy data from HDFS into a Hive table. I tried the code below; it doesn't throw any error, but the data is also not copied into the mentioned Hive table. Here is my code:

sqoop import --connect jdbc:mysql://localhost/sampleOne \
--username root \
--password root \
--external-table-dir "/WithFields" \
--hive-import \
--hive-table "sampleone.customers"   

where sampleone is a database in Hive, customers is a newly created table in Hive, and --external-table-dir is the HDFS path from which I want to load data into the Hive table. What else am I missing in the above code?

  • You do not need Sqoop to copy data from HDFS to Hive. Just create a Hive table with your HDFS location, or copy the data into the existing Hive table's location using the `hadoop fs -cp` command (see the sketch after these comments). – leftjoin Apr 29 '18 at 08:06
  • Stack Overflow is a site for programming and development questions. This question appears to be off-topic because it is not about programming or development. See [What topics can I ask about here](http://stackoverflow.com/help/on-topic) in the Help Center. Perhaps [Super User](http://superuser.com/) or [Unix & Linux Stack Exchange](http://unix.stackexchange.com/) would be a better place to ask. – jww May 02 '18 at 04:39
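
A minimal sketch of the copy approach from the first comment, assuming the table sampleone.customers already exists and uses the default Hive warehouse layout (the destination path below is an assumption and may differ on your cluster):

# Copy the files from /WithFields into the directory backing the existing table
hadoop fs -cp /WithFields/* /user/hive/warehouse/sampleone.db/customers/

Unlike LOAD DATA (shown in the answer below), -cp leaves the original files in /WithFields in place.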

1 Answer


If the data is already in HDFS, you do not need Sqoop to populate a Hive table. The steps to do this are below:

This is the data in HDFS

# hadoop fs -ls /example_hive/country
/example_hive/country/country1.csv

# hadoop fs -cat /example_hive/country/*
1,USA
2,Canada
3,USA
4,Brazil
5,Brazil
6,USA
7,Canada

This is the Hive table creation DDL

CREATE TABLE sampleone.customers
(
  id int, 
  country string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

Verify Hive table is empty

hive (sampleone)> select * from sampleone.customers;
<no rows>

Load Hive table

hive (sampleone)> LOAD DATA INPATH '/example_hive/country' INTO TABLE sampleone.customers;
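
If the intent is to replace the table's contents rather than append to them, Hive's LOAD DATA also accepts an OVERWRITE clause (same path and table as above):

hive (sampleone)> LOAD DATA INPATH '/example_hive/country' OVERWRITE INTO TABLE sampleone.customers;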

Verify Hive table has data

hive (sampleone)> select * from sampleone.customers;
1   USA
2   Canada
3   USA
4   Brazil
5   Brazil
6   USA
7   Canada

Note: This approach will move the data from the /example_hive/country location on HDFS into the Hive warehouse directory (which is itself on HDFS) backing the table.
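
To confirm the move, list the source directory after the load and inspect the table's location with DESCRIBE FORMATTED; the warehouse path in the comment below is the Hive default and is an assumption about this setup:

# hadoop fs -ls /example_hive/country
<empty, since the LOAD moved the files away>

hive (sampleone)> DESCRIBE FORMATTED sampleone.customers;
-- the Location: field shows where the data now lives,
-- typically hdfs:///user/hive/warehouse/sampleone.db/customers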

Jagrut Sharma
  • Yeah, I am aware of this. I thought there might be a way to transfer the data via Sqoop; that's why I tried it. Anyway, thanks. – RushHour Apr 29 '18 at 14:05
  • @Debugger Sqoop is not used between HDFS and Hive. It's used between a JDBC database such as MySQL and HDFS, optionally creating or exporting a Hive table. – OneCricketeer Apr 29 '18 at 17:26
  • @Jagrut You could also skip the load step and just make an external table over the folder with the data file – OneCricketeer Apr 29 '18 at 17:32
  • @cricket_007 Correct, that is another way: an external Hive table could be created pointing to the existing HDFS directory. The question suggested that the Hive table sampleone.customers was already available, hence I suggested the approach of running a command to just populate it. – Jagrut Sharma Apr 29 '18 at 18:57
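
For completeness, the external-table alternative discussed in the last two comments might look like the sketch below. The table name customers_ext is hypothetical; the data stays in place under the given HDFS directory instead of being moved into the warehouse, and dropping the table later removes only the metadata, not the files:

CREATE EXTERNAL TABLE sampleone.customers_ext
(
  id int,
  country string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/example_hive/country';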