Questions tagged [sqoop]

Sqoop is an open source connectivity framework that transfers data between relational database management systems (RDBMS) and HDFS. Sqoop uses MapReduce programs to import and export data, so imports and exports run in parallel.

You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back into an RDBMS.
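
The import and export steps described above can be sketched roughly as follows. This is a dry run under stated assumptions, not a definitive recipe: the JDBC URL, user name, table names, and HDFS paths are all hypothetical, and the commands are printed rather than executed so the sketch runs without a Sqoop installation.

```shell
#!/bin/sh
# Dry-run sketch of a Sqoop import followed by an export.
# All names below (JDBC URL, user, tables, HDFS paths) are hypothetical.
JDBC_URL="jdbc:mysql://localhost/test"
DB_USER="etl_user"
SRC_TABLE="sales"
DST_TABLE="sales_summary"
IMPORT_DIR="/user/etl_user/sales"
EXPORT_DIR="/user/etl_user/sales_summary"

# Step 1: copy the source table into HDFS using 4 parallel map tasks.
# -P makes Sqoop prompt for the password instead of reading it from the command line.
IMPORT_CMD="sqoop import --connect $JDBC_URL --username $DB_USER -P --table $SRC_TABLE --target-dir $IMPORT_DIR --num-mappers 4"

# Step 2 (not shown): transform the files under IMPORT_DIR with a MapReduce
# job that writes its results to EXPORT_DIR.

# Step 3: push the transformed files into a table that already exists in the database.
EXPORT_CMD="sqoop export --connect $JDBC_URL --username $DB_USER -P --table $DST_TABLE --export-dir $EXPORT_DIR"

# Print the commands instead of running them.
echo "$IMPORT_CMD"
echo "$EXPORT_CMD"
```

To run the commands for real, replace each echo with eval, or paste the printed lines into a shell on a machine where Sqoop and the matching JDBC driver are installed.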

Available Sqoop commands:

  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import mainframe datasets to HDFS
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  version            Display version information

Sqoop has been a Top-Level Apache project since March of 2012.

2610 questions
4 votes, 1 answer

Java - com.cloudera.sqoop vs. org.apache.sqoop which to import from sqoop jar?

I am confused about which library to import (com.cloudera.sqoop or org.apache.sqoop). With sqoop-1.4.4-hadoop200.jar included, Eclipse reports: The method run(com.cloudera.sqoop.SqoopOptions) in the type ImportTool is not applicable for the…

4 votes, 1 answer

Difference between Avrodata file and Sequence file with respect to Apache sqoop

From Sqoop's perspective, what is the difference between importing a relational table as a sequence file, like this: sqoop import --connect connectionString \ --username userName -P --table tableName \ --as-sequencefile and importing it as an avrodata…
SparkOn • 8,806 • 4 • 29 • 34

4 votes, 3 answers

sqoop importing mysql issue

I have been using Hadoop for 4 days. To learn Sqoop, I am trying to import a table from my local MySQL database. My machine runs Ubuntu 13.04, Sqoop 1.4.3-cdh4.7.0, and MySQL 5.5.34. This is the command I use at the prompt: sqoop import --connect…
arpho • 1,576 • 10 • 37 • 57

4 votes, 1 answer

sqoop and password encryption using password-file option

I'm using sqoop-1.4.3-cdh4.6.0.jar and I'm wondering if the --password-file option is available in that version. If yes, can someone give me an example of how the encryption process would be invoked? Please provide a command example. I can see that the…
Roman Cwalina • 61 • 2 • 5

4 votes, 0 answers

Sqoop hangs when importing from SQL Server to HDFS

I am trying to import data from SQL Server, but the Sqoop command hangs after the message below. I have tested connectivity and can list databases and tables and even run a select statement, but it fails when it writes the imported data into HDFS. Could you…
user3350280 • 95 • 1 • 8

4 votes, 1 answer

How can I import a table from MySQL to HBase?

use testhadoop; CREATE TABLE employee( empid INT(2), empname varchar(20), salray int (6) ); INSERT INTO employee VALUES (1,'emp1',15000), (1,'emp1',15000), (2,'emp2',12200), (3,'emp3',99999), (4,'emp4',17687), …
sivaramanjaneyulu • 647 • 3 • 11 • 19

4 votes, 0 answers

how to get job counters with sqoop 1.4.4 java api?

I'm using Sqoop 1.4.4 and its java api to run an import job and I'm having trouble figuring out how to access the job counters once the import has completed. I see suitable methods in the ConfigurationHelper class, like getNumMapOutputRecords, but…
mdigan • 41 • 2

4 votes, 2 answers

Running shell script through oozie

I'm trying to execute a shell script through oozie but I'm having some issues. I have a property file like this (import.properties): startIndex=2000 chunkSize=2000 The idea is, in every single execution the startIndex value will be updated by the…
dreamer • 1,039 • 2 • 16 • 36

4 votes, 1 answer

Sqoop is importing an integer as a string

I am trying to use sqoop in a lookup from Microsoft SQL Server. Here is my sqoop script: sqoop import \ --connect 'jdbc:sqlserver://LOOKUPDB-INT;database=Lookup_INT' \ --query "SELECT a.xlate_id, a.foreign_id as XlateKey, CAST(a.main_id as int) as…
4 votes, 3 answers

It seems as though you are running sqoop with a JRE - But JAVA_HOME set to JDK

I tried to set up sqoop (sqoop-1.4.3.bin__hadoop-1.0.0) on Ubuntu. I can run the basic sqoop help etc without problems. When I run the following I get an error: sqoop import --connect jdbc:mysql://localhost/test --table sales -m 1 13/04/19 10:35:24…
Diddy • 193 • 1 • 3 • 11

4 votes, 1 answer

Sqoop incremental import to S3 Wrong FS error

When using the --incremental append flag in the sqoop import, the job will fail. ERROR tool.ImportTool: Imported Failed: Wrong FS: s3n://:@bucket/folder/ Here is the full command: sqoop import --connect…
ender • 245 • 4 • 12

4 votes, 2 answers

How to import large mysql dumps into hadoop?

I need to import Wikipedia dumps (MySQL tables; the unpacked files take about 50 GB) into Hadoop (HBase). Currently I first load the dump into MySQL and then transfer the data from MySQL to Hadoop. But loading the data into MySQL takes a huge amount of time, about 4-7 days.…
hudvin • 63 • 1 • 7

4 votes, 1 answer

Import data on HDFS to SQL Server or export data on HDFS to SQL Server

I have been trying to figure out the best approach for porting data from HDFS to SQL Server. Do I import data from Cloudera Hadoop using the Sqoop Hadoop Connector for SQL Server 2008 R2, or do I export data from Cloudera Hadoop using sqoop…

3 votes, 1 answer

Dataproc: SSL certificate not found for Sqoop job connecting to external PostgreSQL

I need to connect to a PostgreSQL db via SSL. I received 2 certificates and 1 key: sslrootcert=root.crt, sslcert=postgresql.crt, and sslkey=postgresql.key.der. Here is my import config from…
vamper1234 • 104 • 8

3 votes, 0 answers

ERROR tool.BaseSqoopTool: Error parsing arguments for import: when importing Mysql data into S3

I am trying to import MySQL data into an S3 bucket through Sqoop using the following command: sqoop job --create myJob -- import -Dmapreduce.job.user.classpath.first=true -Dfs.s3a.access.key= -Dfs.s3a.secret.key= --connect…
tharindu • 513 • 6 • 26