I am trying to get some table data imported from PostgreSQL to HDFS using Sqoop. Due to licensing constraints, Sqoop does not come packaged with JDBC drivers for every JDBC-compliant database, and PostgreSQL is one of them. To interact with such a database, Sqoop needs the relevant JDBC driver installed into a preset classpath (typically $SQOOP_HOME/lib).

In my case, the Hadoop administrator has not given me write access to this predefined classpath. Is there any alternate way to instruct the Sqoop client to look in some other path (say, my home directory) instead of, or in addition to, the preset location?

I looked into the official Apache documentation and searched the internet, but could not find an answer. Could anyone please help?

Thanks!

Tapas Saha
  • This [question](http://stackoverflow.com/questions/19597328/sqoop-sqlserver-failed-to-load-driver-appropriate-connection-manager-is-not-be) might be useful. – Dev Aug 30 '16 at 08:13

3 Answers

I got this working yesterday. Below are the steps to follow.

  1. Download the appropriate PostgreSQL JDBC driver jar.
  2. Put the jar file in a directory of your choice. I chose the Hadoop cluster user's home directory, i.e. /home/myuser.
  3. export HADOOP_CLASSPATH="/home/myuser/postgresql-9.4.1209.jar" (replace /home/myuser/postgresql-9.4.1209.jar with your own path and jar file name)
  4. To perform the Sqoop import, you can use the command below (a fuller worked sketch follows these steps).

    sqoop import --connect 'jdbc:postgresql://<postgres_server_url>:<postgres_port>/<db_name>' --username <db_user_name> --password <db_user_password> --table <db_table_name> --warehouse-dir <existing_empty_hdfs_directory>

  5. To perform the Sqoop export, you can use the command below.

    sqoop export --connect 'jdbc:postgresql://<postgres_server_url>:<postgres_port>/<db_name>' --username <db_user_name> --password <db_user_password> --table <db_table_name> --export-dir <existing_hdfs_path_containing_export_data>
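
For illustration, here is a minimal end-to-end sketch of steps 3 and 4 with made-up values (the host pgserver.example.com, database mydb, table customers, and user myuser are placeholders, not taken from the original post):

    # Step 3: make the locally downloaded driver visible to Sqoop via HADOOP_CLASSPATH
    export HADOOP_CLASSPATH="/home/myuser/postgresql-9.4.1209.jar"

    # Step 4: import one table into an empty HDFS warehouse directory;
    # -P prompts for the database password instead of passing it on the command line
    sqoop import \
      --connect 'jdbc:postgresql://pgserver.example.com:5432/mydb' \
      --username myuser -P \
      --table customers \
      --warehouse-dir /user/myuser/warehouse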

Tapas Saha
  • This doesn't fix SQL Anywhere with Sqoop (`/opt/sqlanywhere17/java/sajdbc4.jar`); I still get `no dbjdbc17 in java.library.path` – Petro Oct 22 '19 at 13:22

As per Sqoop docs,

-libjars <comma separated list of jars>: specify comma-separated jar files to include in the classpath.

Make sure you use -libjars as the first argument in the command, i.e. immediately after the tool name and before any tool-specific options.
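
For example, a sketch of that ordering (the jar path and connection details are placeholders, not taken from the question):

    # -libjars is a generic Hadoop argument, so it goes right after the tool name,
    # before tool-specific options such as --connect
    sqoop import \
      -libjars /home/myuser/postgresql-9.4.1209.jar \
      --connect 'jdbc:postgresql://pgserver.example.com:5432/mydb' \
      --username myuser -P \
      --table customers \
      --target-dir /user/myuser/customers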

EDIT:

According to docs,

The -files, -libjars, and -archives arguments are not typically used with Sqoop, but they are included as part of Hadoop’s internal argument-parsing system.

So, JDBC client jars need to be put at $SQOOP_HOME/lib.
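
If you do have write access to that directory, the conventional fix is just a copy (a sketch, reusing the jar name from earlier in this thread):

    # Works only with write permission on Sqoop's lib directory,
    # which the question says is not available here
    cp /home/myuser/postgresql-9.4.1209.jar "$SQOOP_HOME/lib/"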

Dev
  • I tried this: sqoop import --libjars /home/myuser/postgresql-9.4.1209.jar --connect 'jdbc:postgres://:5432/' --username -P --table and got the below error: ERROR tool.BaseSqoopTool: Got error creating database manager: java.io.IOException: No manager for connect string: jdbc:postgres://xxxx.xxxxx.xxxx:5432/xxxxxx at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:192) at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:256) at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:89) ..... – Tapas Saha Aug 30 '16 at 07:10
  • @TapasSaha I tried at my end and got the same error. I updated the answer with details. – Dev Aug 30 '16 at 07:30

I recently ran into an issue with this -libjars option: it doesn't work reliably. The problem is probably propagated from the Hadoop jar command-line option handling. A possible workaround is to specify your extra jars using the HADOOP_CLASSPATH environment variable.

You have to export the path to your driver jar file.

export HADOOP_CLASSPATH=<path_to_driver_jar>.jar

After this, Sqoop correctly picks up the jar file you specified; the -libjars option did not pick it up correctly. I noticed this with Sqoop version 1.4.6.
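
As a small sketch (the jar path is only an example), appending rather than overwriting keeps anything already on HADOOP_CLASSPATH:

    # Append the driver jar, preserving any existing HADOOP_CLASSPATH entries,
    # then run the sqoop import/export command as usual
    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}/home/myuser/postgresql-9.4.1209.jar"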

Piyush Patel