Questions tagged [sqoop2]

SQOOP2 is the second-generation version of SQOOP. DO NOT USE this tag for SQOOP 1 questions; use the sqoop tag instead. SQOOP2 was designed to fix the tight coupling that SQOOP has with Hadoop, but as of Nov 2017 it is no longer actively maintained.

From "Upgrading Sqoop 2 from an Earlier CDH 5 Release":

Note: Sqoop 2 is being deprecated. Cloudera recommends using Sqoop 1.


176 questions
2
votes
0 answers

Sqoop: Understanding how num-mappers and fetch-size work together

I am trying to import a table from MySQL incrementally using the following configuration: --split-by date_format(updated_at, '%l') --boundary-query select 1, 12 from…
pratpor
  • 1,954
  • 1
  • 27
  • 46
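For context on this question, a minimal sketch (not from the question; the connection string, table, and split column are placeholders): --num-mappers sets how many parallel map tasks split the table along --split-by, while --fetch-size sets how many rows each mapper pulls per JDBC round trip.

    # Hypothetical example: 4 parallel mappers, 1000 rows fetched per round trip
    sqoop import \
      --connect jdbc:mysql://dbhost/sales \
      --username etl --password-file /user/etl/.pw \
      --table orders \
      --split-by order_id \
      --num-mappers 4 \
      --fetch-size 1000 \
      --target-dir /data/orders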
2
votes
1 answer

Sqoop import validation

Could anyone please help me understand: after importing data from a source system (Postgres, Oracle, SQL Server) to HDFS using Sqoop, what checks do you perform to see whether all the data was imported correctly, without any discrepancies? How…
sat
  • 21
  • 1
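One common check (a sketch with assumed names, not the accepted answer): Sqoop 1's --validate flag compares source and target row counts after the import, and a manual count of the HDFS output gives a second opinion.

    sqoop import \
      --connect jdbc:postgresql://dbhost/prod \
      --username etl --password-file /user/etl/.pw \
      --table customers \
      --target-dir /data/customers \
      --validate                      # compares source vs. imported row counts

    # Independent spot check of the record count that landed in HDFS
    hdfs dfs -cat /data/customers/part-* | wc -l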
2
votes
3 answers

SQOOP incremental import: how does it handle the data when a row is deleted from the database?

Suppose I have an employee table with columns (emp_id, emp_name, emp_age, emp_update_ts); the emp_update_ts field is auto-updated to the current timestamp every time there is an update on the table. Now my question is: when I update/insert the row in the…
Aditya Agarwal
  • 693
  • 1
  • 10
  • 17
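Background for this question: Sqoop's incremental modes only select rows whose check column moved past --last-value, so deleted rows are simply never seen. A hedged lastmodified sketch (table and values are placeholders):

    sqoop import \
      --connect jdbc:mysql://dbhost/hr \
      --username etl --password-file /user/etl/.pw \
      --table employee \
      --incremental lastmodified \
      --check-column emp_update_ts \
      --last-value "2014-01-01 00:00:00" \
      --merge-key emp_id \
      --target-dir /data/employee
    # --merge-key reconciles updated rows with previously imported ones;
    # rows deleted in the source stay in HDFS until you handle them separately.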
2
votes
1 answer

How can we automate the incremental import in Sqoop from a DB to HBase using a Linux script?

Using a sqoop job we can do the incremental load to HBase using --last-value. But how can we do the same with a shell script, and how do we get the --last-value when we automate the script? I mean, how do we store the --last-value and pass it to the next…
Raj
  • 537
  • 4
  • 9
  • 18
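A minimal shell sketch (paths, table, and column names are placeholders, not from the question): keep the high-water mark in a state file and feed it back into --last-value on each run. A saved sqoop job backed by the metastore records the last value automatically, which is usually the simpler option.

    #!/bin/bash
    STATE_FILE=/var/lib/sqoop/orders.lastval
    LAST_VAL=$(cat "$STATE_FILE" 2>/dev/null || echo 0)

    sqoop import \
      --connect jdbc:mysql://dbhost/sales \
      --username etl --password-file /user/etl/.pw \
      --table orders \
      --hbase-table orders --column-family cf \
      --incremental append \
      --check-column order_id \
      --last-value "$LAST_VAL"

    # Record the new high-water mark for the next run
    mysql -h dbhost -u etl -p"$(cat /home/etl/.mysql_pw)" -N \
      -e 'SELECT MAX(order_id) FROM sales.orders' > "$STATE_FILE"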
2
votes
1 answer

How to mask data while ingesting data using sqoop

I am extracting data using Sqoop. Is there any way to mask a particular column in Sqoop, or to modify each cell? For example: creditcardinfo 7888-3333-2222-1002 1111-2342-1235-2090 2331-2131-2222-3421 I want the data to look like this after ingestion: …
Alka
  • 267
  • 1
  • 9
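One hedged approach (a sketch with placeholder table and column names): do the masking in the source query itself with a free-form --query, so the sensitive value never leaves the database unmasked. MySQL syntax shown.

    sqoop import \
      --connect jdbc:mysql://dbhost/payments \
      --username etl --password-file /user/etl/.pw \
      --query "SELECT id, CONCAT('XXXX-XXXX-XXXX-', RIGHT(creditcardinfo, 4)) AS creditcardinfo FROM cards WHERE \$CONDITIONS" \
      --split-by id \
      --target-dir /data/cards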
2
votes
1 answer

Extracting Data from Oracle to Hadoop. Is Sqoop a good idea

I'm looking to extract some data from an Oracle database and transfer it to a remote HDFS file system. There appear to be a couple of possible ways of achieving this: Use Sqoop. This tool will extract the data, copy it across the network and…
Stormcloud
  • 2,065
  • 2
  • 21
  • 41
2
votes
1 answer

Exception: Job Failed with status:3 when copying data from Oracle to HDFS through sqoop2

I am trying to use Sqoop2 to copy data from an Oracle 11gR2 server to HDFS. The link to Oracle seems to work, as it will complain if I use invalid credentials. The definition is as follows: link with id 14 and name OLink (Enabled: true, Created…
A Hocevar
  • 726
  • 3
  • 17
2
votes
1 answer

Sqoop 2 restart with no more jobs

Today I restarted my sqoop server and now all my jobs and links seem to have gone away. Sqoop is working with a Derby database: org.apache.sqoop.repository.jdbc.url=jdbc:derby:@BASEDIR@/repository/db;create=true Do you have any clue how I can…
jBravo
  • 873
  • 1
  • 9
  • 28
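A hedged sketch of the usual suspect (conf/sqoop.properties on the Sqoop 2 server): if @BASEDIR@ resolves relative to where the server was started, a restart from a different directory can create a fresh, empty Derby repository. Pointing the URL at a fixed absolute path avoids that; the path below is a placeholder.

    # conf/sqoop.properties
    org.apache.sqoop.repository.jdbc.url=jdbc:derby:/var/lib/sqoop2/repository/db;create=true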
2
votes
1 answer

Can Hadoop and Sqoop run separately on different machines?

I have installed Sqoop 1.4.6 on a new node and Hadoop is running on a different node. Can I point my Sqoop server to use the existing Hadoop environment? I know there is an argument "--hadoop-mapred-home" to set the Hadoop path, but this is used within the same…
code1234
  • 121
  • 1
  • 2
  • 11
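For context, a hedged sketch (paths and addresses are placeholders): Sqoop 1 runs as a client, so the Sqoop node only needs Hadoop client libraries plus copies of the cluster's config files; the environment variables below tell Sqoop where to find them.

    export HADOOP_COMMON_HOME=/opt/hadoop
    export HADOOP_MAPRED_HOME=/opt/hadoop
    export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop   # core-site.xml, yarn-site.xml, ...

    # core-site.xml / yarn-site.xml here must point at the remote cluster, e.g.
    #   fs.defaultFS                 = hdfs://remote-namenode:8020
    #   yarn.resourcemanager.address = remote-resourcemanager:8032
    sqoop import --connect jdbc:mysql://dbhost/db --table t \
      --username etl --password-file /user/etl/.pw --target-dir /data/t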
2
votes
2 answers

SQOOP export in shell script fails

I am exporting a table from Hive to MySQL with the help of a shell script. Below is the sqoop export command: sqoop export --connect jdbc:mysql://192.168.154.129:3306/ey -username root --table call_detail_records --export-dir…
prasannads
  • 609
  • 2
  • 14
  • 28
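A hedged sketch of the shape such a script usually takes (the export dir, password handling, and delimiter are placeholders; Hive-managed tables commonly need --input-fields-terminated-by '\001'). Checking Sqoop's exit code makes the script fail loudly instead of silently.

    #!/bin/bash
    sqoop export \
      --connect jdbc:mysql://192.168.154.129:3306/ey \
      --username root --password-file /user/etl/.pw \
      --table call_detail_records \
      --export-dir /user/hive/warehouse/call_detail_records \
      --input-fields-terminated-by '\001'

    if [ $? -ne 0 ]; then
      echo "sqoop export failed" >&2
      exit 1
    fi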
2
votes
1 answer

How can I specify sqoop export columns in the target DB?

I would like to populate a Postgres table from an Avro file using sqoop (2) export, but I don't have an id field in the source; it should be populated automatically (serial type), but I am getting an error. Table DDL: CREATE TABLE test ( id serial…
clairvoyant
  • 129
  • 1
  • 14
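For comparison, the Sqoop 1 way (a hedged sketch; column names and the export dir are placeholders): list only the columns present in the export data with --columns, so Postgres fills the serial id itself. Sqoop 2 exposes a similar idea through the job's column configuration rather than a flag.

    sqoop export \
      --connect jdbc:postgresql://dbhost/testdb \
      --username etl --password-file /user/etl/.pw \
      --table test \
      --columns "name,created_at" \
      --export-dir /data/test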
2
votes
3 answers

Date field issues while using Sqoop with --as-avrodatafile option

Following is the gist of my problem. Env: Hadoop 2 (CDH 5.1); database: Oracle 11g. Scenario: I'm sqooping fact and dimension tables from the database into HDFS. Initially, I had challenges in handling nulls (which was handled using --null-string…
venBigData
  • 600
  • 1
  • 8
  • 23
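A hedged sketch of the workaround usually discussed for this (the table and column names are placeholders): Oracle DATE values otherwise land in the Avro file as epoch longs, so --map-column-java forces the affected column to a string.

    sqoop import \
      --connect jdbc:oracle:thin:@dbhost:1521:ORCL \
      --username etl --password-file /user/etl/.pw \
      --table SALES_FACT \
      --as-avrodatafile \
      --map-column-java LOAD_DATE=String \
      --null-string '\\N' --null-non-string '\\N' \
      --target-dir /data/sales_fact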
2
votes
1 answer

Override multiple sqoop properties while executing sqoop job

I am finding that when overriding sqoop job properties at runtime, I am able to override only one property. Example 1: if I submit sqoop job --exec test123 -- --query "select * from test where update_batch_id between 4 and 10 and \$CONDITIONS" --…
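For readers of this question, a sketch of the override syntax being discussed (whether every option can be overridden this way is exactly what is being asked; the target dir is a placeholder): all overrides go after a single "--" separator, in the order the underlying import tool accepts them.

    sqoop job --exec test123 -- \
      --query "select * from test where update_batch_id between 4 and 10 and \$CONDITIONS" \
      --target-dir /data/test_batch_4_10 \
      --num-mappers 2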
2
votes
2 answers

Sqoop incremental import (db schema incorrect)

I'm trying to do an incremental import using the Sandbox 2.1 and a Microsoft SQL Server (AdventureWorks database). For the incremental import I'm using the following command: sqoop import --connect…
Bas
  • 597
  • 5
  • 10
  • 22
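A hedged sketch of the schema angle flagged in the title (connection details are placeholders): AdventureWorks tables live in non-default schemas such as HumanResources, and the SQL Server connector takes the schema as an extra argument after "--".

    sqoop import \
      --connect "jdbc:sqlserver://dbhost:1433;database=AdventureWorks" \
      --username etl --password-file /user/etl/.pw \
      --table Employee \
      --incremental append \
      --check-column BusinessEntityID \
      --last-value 0 \
      --target-dir /data/employee \
      -- --schema HumanResources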
2
votes
2 answers

How to get the running Sqoop server URL and port in the Hortonworks Sandbox

I am using the Sqoop client, and I don't know which URL I should use to initialize the SqoopClient object. I am running the Hortonworks Sandbox, which is preconfigured with everything. I don't know whether it has a Sqoop server running or not, and if it is running, then I…
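A hedged pointer (the host name is the sandbox default; verify locally): the Sqoop 2 server normally listens on port 12000 with web app path "sqoop", so the client URL is typically http://<host>:12000/sqoop/. From the sqoop2 shell you can point at the sandbox and confirm the server answers; the same URL is what the Java SqoopClient constructor expects.

    sqoop2-shell
    sqoop:000> set server --host sandbox.hortonworks.com --port 12000 --webapp sqoop
    sqoop:000> show version --all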