
My Sqoop import works only with 1 map task (-m 1), not more.

This is working:

sqoop import --connect jdbc:mysql://localhost/databaseY --username root --password PASSWORD --table tableX --target-dir /tmp/databaseY --as-textfile -m 1

This does not:

sqoop import --connect jdbc:mysql://localhost/databaseY --username root --password PASSWORD --table tableX --target-dir /tmp/databaseY --as-textfile -m 3

My cluster has 3 nodes on AWS.

Did I miss something during the configuration?

---- EDIT FOR THE SOLUTION ---- The problem was localhost: when the import tasks run on the other cluster nodes, localhost resolves to each node itself rather than to the MySQL host. I changed it to the IP address and it is working fine.


1 Answer

The Sqoop docs shed enough light on this:

When performing parallel imports, Sqoop needs a criterion by which it can split the workload. Sqoop uses a splitting column to split the workload. By default, Sqoop will identify the primary key column (if present) in a table and use it as the splitting column. The low and high values for the splitting column are retrieved from the database, and the map tasks operate on evenly-sized components of the total range. For example, if you had a table with a primary key column of id whose minimum value was 0 and maximum value was 1000, and Sqoop was directed to use 4 tasks, Sqoop would run four processes which each execute SQL statements of the form SELECT * FROM sometable WHERE id >= lo AND id < hi, with (lo, hi) set to (0, 250), (250, 500), (500, 750), and (750, 1001) in the different tasks.
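The range computation described above can be sketched in Python. This is a simplified illustration of the behaviour the docs describe, not Sqoop's actual code; the function name `split_ranges` is made up:

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the splitting column's [lo, hi] range into num_mappers
    query ranges. Each mapper runs a query of the form
    SELECT * FROM sometable WHERE id >= start AND id < end.
    The last range ends at hi + 1 so the strict upper bound still
    includes the maximum value."""
    size = (hi - lo) // num_mappers
    ranges = []
    start = lo
    for i in range(num_mappers):
        end = hi + 1 if i == num_mappers - 1 else start + size
        ranges.append((start, end))
        start = end
    return ranges

# Reproduces the docs' example: id from 0 to 1000, 4 tasks.
print(split_ranges(0, 1000, 4))
# [(0, 250), (250, 500), (500, 750), (750, 1001)]
```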


If a table does not have a primary key defined and the --split-by <col> is not provided, then import will fail unless the number of mappers is explicitly set to one with the --num-mappers 1 option.
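So if tableX has no primary key, you can either name a splitting column explicitly or fall back to one mapper. A sketch based on the command from the question (columnZ is a hypothetical column with reasonably uniform values; replace the placeholder host with your MySQL host's address):

```
# Parallel import with an explicit splitting column (columnZ is hypothetical)
sqoop import --connect jdbc:mysql://<mysql-host-ip>/databaseY --username root --password PASSWORD --table tableX --target-dir /tmp/databaseY --as-textfile -m 3 --split-by columnZ

# Or keep a single mapper, which needs no splitting column
sqoop import --connect jdbc:mysql://<mysql-host-ip>/databaseY --username root --password PASSWORD --table tableX --target-dir /tmp/databaseY --as-textfile --num-mappers 1
```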

(Emphasis is mine)

Edit: My previous answer on a related topic will also help you with this.
