Hadoop HADOOP_CLASSPATH issues

Question

This question is not about distributing jars across the cluster for the workers to use.

It refers to specifying a number of additional libraries on the client machine. To be more specific: I'm trying to run the following command in order to retrieve the contents of a SequenceFile:

   /path/to/hadoop/script fs -text /path/in/HDFS/to/my/file

It fails with this error:

   text: java.io.IOException: WritableName can't load class: util.io.DoubleArrayWritable

I have a writable class called DoubleArrayWritable. In fact, on another computer everything works well.

I tried setting HADOOP_CLASSPATH to include the jar containing that class, but it had no effect. In fact, when I run:

   /path/to/hadoop/script classpath

the output doesn't contain the jar I added to HADOOP_CLASSPATH.

The question is: how do you specify extra libraries when running hadoop (by "extra" I mean libraries other than the ones the hadoop script automatically includes in the classpath)?

Some more info which might help:

I can't modify the hadoop.sh script (nor any associated scripts)
I can't copy my library to the /lib directory under the hadoop installation directory
hadoop-env.sh, which is run from hadoop.sh, contains the line export HADOOP_CLASSPATH=$HADOOP_HOME/lib, which probably explains why my HADOOP_CLASSPATH env var is ignored.
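
The suspicion in the last point can be reproduced without Hadoop. The sketch below is only an illustration: the two env scripts are hypothetical stand-ins for hadoop-env.sh, and /opt/hadoop is an assumed install location. It contrasts a plain assignment, which discards whatever the caller exported, with an appending form, shown purely for comparison since the real scripts can't be changed here.

```shell
#!/bin/sh
# Hypothetical stand-ins for hadoop-env.sh; neither file is part of Hadoop.
tmpdir=$(mktemp -d)

# Overwriting form, like the line quoted above: the caller's value is lost.
cat > "$tmpdir/env-overwrite.sh" <<'EOF'
export HADOOP_CLASSPATH=$HADOOP_HOME/lib
EOF

# Appending form: keeps whatever the caller had set.
cat > "$tmpdir/env-append.sh" <<'EOF'
export HADOOP_CLASSPATH=$HADOOP_HOME/lib${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}
EOF

# Source one env script in a subshell that starts with the caller's setting,
# then print the resulting HADOOP_CLASSPATH.
show() (
    HADOOP_HOME=/opt/hadoop                    # assumed install location
    HADOOP_CLASSPATH=/path/to/jar/myjar.jar    # what the caller exported
    . "$1"
    printf '%s\n' "$HADOOP_CLASSPATH"
)

overwritten=$(show "$tmpdir/env-overwrite.sh")
appended=$(show "$tmpdir/env-append.sh")
echo "overwrite: $overwritten"
echo "append:    $appended"
rm -rf "$tmpdir"
```

With the overwriting form the jar never reaches the classpath, which would match what the classpath command reports.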

Lorand Bendig · Accepted Answer · 2012-10-18T21:26:15.610

15

If you are allowed to set HADOOP_CLASSPATH then

export HADOOP_CLASSPATH=/path/to/jar/myjar.jar:$HADOOP_CLASSPATH; \
    hadoop fs -text /path/in/HDFS/to/my/file

will do the job. In your case, however, this variable is overridden in hadoop-env.sh, so consider using the -libjars option instead:

hadoop fs -libjars /path/to/jar/myjar.jar -text /path/in/HDFS/to/my/file

Alternatively invoke FsShell manually:

java -cp "$HADOOP_HOME/lib/*:/path/to/jar/myjar.jar:$CLASSPATH" \
    org.apache.hadoop.fs.FsShell -conf "$HADOOP_HOME/conf/core-site.xml" \
    -text /path/in/HDFS/to/my/file
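
When assembling the classpath for a manual invocation like this, an unset $CLASSPATH leaves a trailing ":" in the string, and as far as I know Java treats an empty entry as the current directory. A minimal sketch of a helper that avoids this follows; join_classpath is a hypothetical name, not something Hadoop ships, and /opt/hadoop is an assumed default.

```shell
#!/bin/sh
# join_classpath (hypothetical helper): join the arguments with ':' and
# skip empty ones, so an unset $CLASSPATH adds no empty entry.
join_classpath() {
    cp=
    for entry in "$@"; do
        [ -n "$entry" ] || continue
        cp=${cp:+$cp:}$entry
    done
    printf '%s\n' "$cp"
}

HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}   # assumed install location

# The wildcard is quoted so the JVM, not the shell, expands it to the jars.
client_cp=$(join_classpath "$HADOOP_HOME/lib/*" /path/to/jar/myjar.jar "${CLASSPATH:-}")
echo "$client_cp"

# The string would then be passed on, for example:
# java -cp "$client_cp" org.apache.hadoop.fs.FsShell -text /path/in/HDFS/to/my/file
```

Depending on the Hadoop version, the core Hadoop jars may live outside lib, so further entries might be needed.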

edited Oct 18 '12 at 21:26

answered Oct 17 '12 at 21:20

Lorand Bendig

yes but the hadoop script does this: export HADOOP_CLASSPATH=$HADOOP_HOME/lib. It rewrites my HADOOP_CLASSPATH – Razvan Oct 17 '12 at 22:32
as I said, I don't want to set the "distributed" classpath. I just want to set the classpath on the client machine – Razvan Oct 18 '12 at 14:09
@Razvan : Ok. Then you can invoke FsShell manually (see the example above). – Lorand Bendig Oct 18 '12 at 21:32

score 4 · Answer 2 · answered Mar 08 '19 at 11:23

To check the Hadoop classpath, run hadoop classpath in a terminal.
To compile against it, use: javac -cp "$(hadoop classpath):path/to/jars/*" java_file.java
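
To confirm whether a particular jar actually made it onto the classpath, the colon-separated output can be checked entry by entry. This is only a sketch: classpath_has is a hypothetical helper, and the sample string stands in for real hadoop classpath output.

```shell
#!/bin/sh
# classpath_has ENTRY CLASSPATH (hypothetical helper): succeed if ENTRY is
# exactly one of the ':'-separated entries of CLASSPATH.
classpath_has() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Sample value standing in for the output of: hadoop classpath
sample_cp=/opt/hadoop/conf:/opt/hadoop/lib/*:/path/to/jar/myjar.jar

if classpath_has /path/to/jar/myjar.jar "$sample_cp"; then
    echo "jar is on the classpath"
else
    echo "jar is missing"
fi
```

Against a real installation the call would be classpath_has /path/to/jar/myjar.jar "$(hadoop classpath)". The match is exact, so a jar that is only covered by a wildcard entry such as lib/* is not detected.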

subtleseeker

score 0 · Answer 3 · answered Oct 18 '12 at 05:51

Try adding your jar file to the default CLASSPATH variable, with HADOOP_CLASSPATH appended to it, then run your command:

export CLASSPATH=/your/jar/file/myjar.jar:$CLASSPATH:$HADOOP_CLASSPATH
/path/to/hadoop/script fs -text /path/in/HDFS/to/my/file

Rahul Mahajan

the default classpath var is ignored by hadoop – Razvan Oct 18 '12 at 20:56
