Questions tagged [sqoop]

Sqoop is an open source connectivity framework that facilitates data transfer between multiple Relational Database Management Systems (RDBMS) and HDFS. Sqoop uses MapReduce programs to import and export data; the imports and exports are performed in parallel.

You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back into an RDBMS.
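The import and export steps of that round trip could be sketched as follows. The JDBC URL, user name, table names, column name, and HDFS paths are hypothetical placeholders, and the `run` helper is an illustration-only wrapper that prints each command unless `DRY_RUN=0` is set, so the sketch can be read or tried without a cluster.

```shell
#!/bin/sh
# Hypothetical connection details; replace with real values.
JDBC_URL="jdbc:mysql://db.example.com/sales"
DB_USER="etl_user"

# Illustration-only helper: print the command by default,
# execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Import the (hypothetical) 'orders' table into HDFS,
#    split on order_id across 4 parallel map tasks.
run sqoop import \
  --connect "$JDBC_URL" \
  --username "$DB_USER" -P \
  --table orders \
  --split-by order_id \
  --num-mappers 4 \
  --target-dir /data/orders

# 2. A MapReduce job would transform /data/orders here,
#    writing its result to /data/orders_summary (not shown).

# 3. Export the transformed files back into an existing RDBMS table.
run sqoop export \
  --connect "$JDBC_URL" \
  --username "$DB_USER" -P \
  --table orders_summary \
  --export-dir /data/orders_summary
```

The `-P` flag prompts for the password on the console instead of placing it on the command line. The target table of an export is assumed to exist already in the database.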

Available Sqoop commands:

  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import mainframe datasets to HDFS
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  version            Display version information
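As a rough illustration of the inspection commands in this list, the sketch below checks what a server exposes before any import is attempted. The host, database, user, and table names are assumptions, and the `run` helper is a hypothetical wrapper that prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical connection details; replace with real values.
JDBC_SERVER="jdbc:mysql://db.example.com/"
JDBC_URL="jdbc:mysql://db.example.com/sales"
DB_USER="etl_user"

# Illustration-only helper: print the command by default,
# execute it only when DRY_RUN=0. The printed form drops shell quoting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# List the databases visible on the server.
run sqoop list-databases --connect "$JDBC_SERVER" --username "$DB_USER" -P

# List the tables in one database.
run sqoop list-tables --connect "$JDBC_URL" --username "$DB_USER" -P

# Evaluate a quick SQL statement and display the result.
run sqoop eval --connect "$JDBC_URL" --username "$DB_USER" -P \
  --query "SELECT COUNT(*) FROM orders"
```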

Sqoop has been a Top-Level Apache project since March of 2012.

2610 questions
5 votes
5 answers

What is the difference between --split-by and --boundary-query in SQOOP?

Assuming we don't have a column where values are equally distributed, let's say we have a command like this: sqoop import \ ... --boundary-query "SELECT min(id), max(id) from some_table" --split-by id ... What's the point of using --boundary-query…
burakongun
5 votes
4 answers

Sqoop import Null string

Null values are displayed as '\N' when a Hive external table is queried. Below is the Sqoop import script: sqoop import -libjars /usr/lib/sqoop/lib/tdgssconfig.jar,/usr/lib/sqoop/lib/terajdbc4.jar -Dmapred.job.queue.name=xxxxxx \ --connect…
Bagavathi
5 votes
3 answers

Incremental data load using sqoop without primary key or timestamp

I have a table that has no primary key and no datemodified/timestamp column. It is a transaction-style table that only accumulates data (no deletes or updates). My problem is that I want to ingest the data into HDFS without loading the whole…
MMakati
5 votes
1 answer

Can we split Sqoop job by multiple column combination

I am using the Sqoop syntax below to split a Sqoop job by a single column (mostly the primary key). sqoop import --connect jdbc:oracle:thin:@//oracle_server:1521/sid --username xxx --password xxx --table EMPLOYEE --split-by ID -m 10 Can we use multiple…
akshat thakar
5 votes
2 answers

When to use Sqoop --create-hive-table

Can anyone explain the difference between the create-hive-table and hive-import methods? Both create a Hive table, so what is the significance of each?
Priya v v
5 votes
4 answers

Sqoop import as ORC file

Is there any option in Sqoop to import data from an RDBMS and store it in ORC file format in HDFS? Alternatives tried: imported as text format, then used a temp table in Hive to read the input as a text file and write it to HDFS as ORC.
5 votes
9 answers

sqoop import issue with mysql

I have a Hadoop HA setup based on CDH5. I tried to import tables from MySQL using Sqoop, but it failed with the following error: 15/03/20 12:47:53 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set…
Bobin Jose
5 votes
2 answers

Export HDFS file with custom delimiter into Mysql via Sqoop

I have a file like this: 1^%~binod^*~1^%~ritesh^*~1^%~shisir^*~1^%~budhdha^*~1^%~romika^*~1^%~rubeena^*~ Where --input-fields-terminated-by '^%~' --input-lines-terminated-by '^*~'. I tried to export via command: sqoop export --connect…
RaiBnod
5 votes
3 answers

hive-drop-import-delims not removing newline while using HCatalog in Sqoop

When Sqoop is used with an HCatalog import, it does not remove newlines (\n) from column data, even with the --hive-drop-import-delims option in the command, when running Apache Sqoop with Oracle. Sqoop query: sqoop import --connect…
Suraj Nayak
5 votes
3 answers

How to change sqoop metastore?

I am using Sqoop version 1.4.2. I am trying to change the Sqoop metastore from the default HSQLDB to MySQL. I have configured the following properties in the sqoop-site.xml file: sqoop.metastore.client.enable.autoconnect
Chetan Shirke
5 votes
7 answers

Sqoop: Could not load mysql driver exception

I installed Sqoop on my local machine. The config information is as follows. Bash.bashrc: export HADOOP_HOME=/home/hduser/hadoop export HBASE_HOME=/home/hduser/hbase export HIVE_HOME=/home/hduser/hive export…
Sam
5 votes
1 answer

Getting NoServerForRegionException: Unable to find region when attempting to import from MySQL into HBase

I'm having problems importing a table from MySQL into HBase using Sqoop. I'm working in a cluster with 3 nodes (1 master, 2 slaves). When I tried to run this command: sqoop import --hbase-create-table --hbase-table (any_tablename) …
Florencia
5 votes
3 answers

How to use a specified Hive database when using Sqoop import

sqoop import --connect jdbc:mysql://remote-ip/db --username xxx --password xxx --table tb --hive-import The above command imports table tb into the 'default' Hive database. Can I use another database instead?
JustFF
5 votes
1 answer

How to use the sqoop generated class in MapReduce?

A Sqoop query generates a Java file containing a class with the code to access each row's column data in MapReduce. (The Sqoop import was done as text, without the --as-sequencefile option, and with 1 line per record and…
bill ou
5 votes
3 answers

How to use autoincrement-IDs in Sqoop export

I have a tab-separated textfile in HDFS, and want to export this into a MySQL table. Since the rows in the textfile do not have numerical ids, how do I export into a table with an ID automatically set during the SQL INSERT (autoincrement)? If I try…
thomers