18

I'm studying Hadoop and currently I'm trying to set up a Hadoop 2.2.0 single node. I downloaded the latest distribution and uncompressed it; now I'm trying to set up the Hadoop Distributed File System (HDFS).

Now, I'm trying to follow the Hadoop instructions available here but I'm quite lost.

In the left bar you see there are references to the following files:

  • core-default.xml
  • hdfs-default.xml
  • mapred-default.xml
  • yarn-default.xml

But where are those files, and what should they contain?

I found /etc/hadoop/hdfs-site.xml, but it is empty!

I found /share/doc/hadoop/hadoop-project-dist/hadoop-common/core-default.xml, but it is just a piece of documentation!

So, which files do I have to modify to configure HDFS? And where are the default values read from?

Thanks in advance for your help.

danidemi
  • 4,404
  • 4
  • 34
  • 40
  • For Installing Hadoop 2.2.0 You follow [this link](http://learninghadoopblog.wordpress.com/2013/08/03/hadoop-0-23-9-single-node-setup-on-ubuntu-13-04/). It is for "0.23.9" but it works absolutely fine for "2.2.0" – Rushikesh Garadade Jan 27 '14 at 08:28

5 Answers

18

These files are all found in the hadoop/conf directory.

To set up HDFS you have to configure core-site.xml and hdfs-site.xml.

HDFS works in two modes: distributed (multi-node cluster) and pseudo-distributed (cluster of one single machine).

For the pseudo-distributed mode you have to configure:

In core-site.xml:

<!-- namenode -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:8020</value>
</property>

In hdfs-site.xml:

<!-- storage directory for HDFS; hadoop.tmp.dir defaults to /tmp/hadoop-${user.name} -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/your-dir/</value>
</property>

Each property has its hardcoded default value.

Please remember to set up password-less SSH login for the hadoop user before starting HDFS.
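The password-less SSH setup mentioned above can be sketched as follows. This is a sketch, not a definitive recipe: it assumes you run it as the hadoop user and that the `bin/` and `sbin/` paths are relative to your Hadoop 2.2.0 install directory, so verify the paths against your own layout:

```shell
# Generate a passphrase-less key pair for the current user (skip if one already exists)
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

# Authorize the key for loopback logins, which the start scripts rely on
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# This should now log in without prompting for a password
ssh localhost exit

# Format the NameNode once (destroys existing HDFS metadata), then start HDFS
bin/hdfs namenode -format
sbin/start-dfs.sh
```

If `ssh localhost` still prompts for a password, check the permissions on `~/.ssh` (700) and `authorized_keys` (600), since sshd refuses keys in group- or world-writable files.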

P.S.

If you downloaded Hadoop from Apache, you can consider switching to a Hadoop distribution:

Cloudera's CDH, HortonWorks or MapR.

If you install Cloudera CDH or Hortonworks HDP you will find the files in /etc/hadoop/conf/.

proutray
  • 1,943
  • 3
  • 30
  • 48
Evgeny Benediktov
  • 1,389
  • 1
  • 10
  • 13
  • 1
    Thanks for pointing me to Cloudera's CDH. Currently my intention is to understand how to work with Hadoop from scratch, if you know what I mean, just to understand at least all the pieces. I agree a distribution like the one you pointed at could be a quicker solution. – danidemi Feb 02 '14 at 16:11
  • 1
    In that case I recommend reading O'Reilly's Hadoop: The Definitive Guide, 2012 Edition. – Evgeny Benediktov Feb 02 '14 at 17:38
5

For Hortonworks, the location would be

/etc/hadoop/conf/hdfs-site.xml
Indrajeet Gour
  • 4,020
  • 5
  • 43
  • 70
4

All the configuration files are located in the extracted tar.gz file, in the etc/hadoop/ directory. The hdfs-site.xml file may be named hdfs-site.xml.template; you will need to rename it to hdfs-site.xml.

If you want to see what options are available for HDFS, check the documentation in the tarball at share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
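Those *-default.xml files are plain XML lists of `<property>` entries (name, value, description), so you can pull the default names and values out of them programmatically. A minimal sketch in Python; the sample data below is illustrative, mirroring the hdfs-default.xml format rather than quoting a real release file:

```python
# Sketch: extract property names and default values from a Hadoop
# *-default.xml file. The sample string mimics hdfs-default.xml's layout.
import xml.etree.ElementTree as ET

sample = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>"""

def list_defaults(xml_text):
    """Return a dict mapping each property name to its default value."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

print(list_defaults(sample))
```

To run it against the real file, replace `sample` with the contents of share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml from your tarball.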

Chris Hinshaw
  • 6,967
  • 2
  • 39
  • 65
1

These files can be found in /usr/lib/hadoop-2.2.0/etc/hadoop; in that location you can find all the XML files.

Batty
  • 121
  • 1
  • 5
1

For hadoop 3.2, the default config can be found at:

Eric
  • 22,183
  • 20
  • 145
  • 196