
I'm trying to import Cloud SQL tables into a GCS bucket using Sqoop. I've used the following jars:

kite-data-core-1.1.0.jar, kite-data-hive-1.1.0.jar, kite-data-mapreduce-1.1.0.jar, kite-hadoop-compatibility-1.1.0.jar.

Below is the command I'm running:

```
sqoop import \
  -libjars=gs://BUCKET_NAME/kite-data-core-1.1.0.jar,gs://BUCKET_NAME/kite-data-mapreduce-1.1.0.jar,gs://BUCKET_NAME/kite-data-hive-1.1.0.jar,gs://BUCKET_NAME/kite-hadoop-compatibility-1.1.0.jar,gs://BUCKET_NAME/hadoop-mapreduce-client-core-3.2.0.jar \
  --connect=jdbc:mysql://IP/DB_NAME \
  --username=sqoop_user \
  --password=sqoop_user \
  --target-dir=gs://BUCKET_NAME/mysql_output \
  --table persons \
  --split-by personid -m 2 \
  --as-parquetfile
```

I'm getting the following error:

    20/01/03 04:42:29 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    Exception in thread "main" java.lang.NoClassDefFoundError: org/kitesdk/data/mapreduce/DatasetKeyOutputFormat
        at org.apache.sqoop.mapreduce.DataDrivenImportJob.getOutputFormatClass(DataDrivenImportJob.java:190)
        at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:94)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
        at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
    Caused by: java.lang.ClassNotFoundException: org.kitesdk.data.mapreduce.DatasetKeyOutputFormat
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)

The first line of the output says "mapred.jar is deprecated. Instead, use mapreduce.job.jar".

I've downloaded mapreduce.job.jar and passed it as a libjars argument, but the error remains the same.

Any help with this issue is much appreciated.

Adrian Mole
  • This link might help. https://medium.com/google-cloud/moving-data-with-apache-sqoop-in-google-cloud-dataproc-4056b8fa2600 – marjun Jan 02 '20 at 10:27
  • You can see [here](https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/DeprecatedProperties.html) that mapred.jar is deprecated, so you have to use mapreduce.job.jar instead. – Nibrass H Jan 03 '20 at 08:34
  • Can you share how you are importing mapreduce.job.jar? – Nibrass H Jan 03 '20 at 08:34
  • You can follow this [Apache Hadoop Official Documentation](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) and import all the necessary jars – Nibrass H Jan 03 '20 at 08:35
  • There is also a similar [post](https://stackoverflow.com/questions/19436361/issue-with-org-apache-hadoop-mapreduce-imports-in-apache-hadoop-2-2) on Stack Overflow which may help you. – Nibrass H Jan 03 '20 at 08:36
  • I've downloaded the latest version of mapreduce.job.jar from this link - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-mapreduce-client-core/3.2.1 As I'm working with GCP, I'm uploading the jar to a GCS bucket and passing it as a 'libjar' argument. – Chakkirala Chaitanya Jan 06 '20 at 09:01
  • I think the mapreduce.job.jar line is only an INFO message, as the first line says. The actual error appears to be Exception in thread "main" java.lang.NoClassDefFoundError ... [Here](https://community.cloudera.com/t5/Support-Questions/Sqoop-import-exception-java-lang-NoClassDefFoundError-org/td-p/50758) you can find a solution to your issue. – Nibrass H Jan 06 '20 at 12:44
  • This [post](https://stackoverflow.com/questions/41405072/sqoop-integration-with-hadoop-throw-classnotfoundexception) will help you solve your issue. "mapred.jar is deprecated" is not an error but an INFO message; java.lang.NoClassDefFoundError is the error in your case. – Nibrass H Jan 06 '20 at 12:52
  • If the previous posts don't help you, please raise an issue on [GitHub](https://github.com/apache/sqoop) – Nibrass H Jan 06 '20 at 12:55

1 Answer


These are the specific jar versions that worked for me (mostly Cloudera):

Full script for the Sqoop job shared in this answer.
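The linked script is not reproduced here. Separately from it, and only as a rough sketch: the stack trace in the question is thrown in thread "main" through `AppClassLoader`, which suggests the class is missing from the classpath of the JVM that launches Sqoop. One thing worth trying, assuming the job is started with the `sqoop` launcher on a machine where `gsutil` is available, is to copy the jars to a local directory and put them on `HADOOP_CLASSPATH`. The helper name `build_classpath` and the directory `/tmp/kite-jars` are hypothetical, chosen for illustration.

```shell
# Build a colon-separated classpath from every *.jar in a local directory.
# Runs in a subshell so its variables do not leak into the caller.
build_classpath() (
  dir="$1"
  cp=""
  for jar in "$dir"/*.jar; do
    # Skip the unexpanded pattern when the directory has no jars.
    [ -e "$jar" ] || continue
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
)

# Possible usage (assumed paths; left commented out, adjust before running):
#   mkdir -p /tmp/kite-jars
#   gsutil cp 'gs://BUCKET_NAME/kite-*.jar' /tmp/kite-jars/
#   unzip -l /tmp/kite-jars/kite-data-mapreduce-1.1.0.jar | grep DatasetKeyOutputFormat
#   export HADOOP_CLASSPATH="$(build_classpath /tmp/kite-jars)${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
```

The `unzip -l` line is just a check that the jar really contains `org/kitesdk/data/mapreduce/DatasetKeyOutputFormat`. Whether this resolves the error depends on how the job is submitted, so treat it as a diagnostic step, not a confirmed fix.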

xgMz