
I am a newbie trying to set up an HDFS system to serve my data at my lab (I don't plan to use MapReduce).

So far I have read about cluster setup, but I am still confused. Several questions:

  • Do I need to have a secondary namenode?
  • There are 2 files, masters and slaves. Do I really need these 2 files even though I just want HDFS? If I do need them, what should go in them? I assume my namenode goes in masters and my datanodes in slaves? Do I need slave nodes at all?
  • What configuration files are needed for the namenode, secondary namenode, datanode, and client? (I assume core-site.xml is needed for all four?) In addition, can someone suggest a good configuration model? Sample configurations for the namenode, secondary namenode, datanode, and client would be very helpful.

I am getting confused because most of the documentation assumes I want to use MapReduce, which isn't the case.

Ananymous

1 Answer


To answer your first two questions:

1. No, you do not need a secondary namenode if you don't care about recovering from a namenode failure. (Note that the secondary namenode is not a standby; it only takes periodic checkpoints of the namenode's metadata to make recovery faster.)
2. You need the slaves file to start the datanode daemons from the namenode using Hadoop's start-dfs.sh script. You do not need the masters file if you do not want to use a secondary namenode. A sketch of both files follows below.
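For example, here is a minimal sketch of the two files in Hadoop's conf/ directory, assuming a namenode host named namenode1 and two datanode hosts named datanode1 and datanode2 (all placeholder hostnames):

    # conf/slaves -- one datanode hostname per line;
    # start-dfs.sh logs into each of these over ssh and launches a DataNode
    datanode1
    datanode2

    # conf/masters -- despite the name, this file lists the hosts that run
    # the *secondary* namenode, not the namenode itself; if you skip the
    # secondary namenode you can leave this file empty
    # (example entry if you did want one:)
    # namenode1

With those in place, format the filesystem once and start the HDFS daemons from the namenode:

    bin/hadoop namenode -format   # one-time: initializes the namenode metadata
    bin/start-dfs.sh              # starts the namenode locally and a datanode on each host in slaves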

For your 3rd question: there is step-by-step documentation on how to install a small Hadoop cluster at http://www.hadoop-blog.com/2010/11/how-to-quickly-install-hadoop-020-in.html

Please go through it; you can skip the steps that talk about the JobTracker and TaskTrackers, and that should be enough to get your HDFS running.
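As a starting point for your configuration question, here is a minimal sketch for an HDFS-only cluster, assuming Hadoop 0.20 and the same placeholder hostname namenode1; the storage paths below are also placeholders you should change. core-site.xml is the one file all four roles (namenode, secondary namenode, datanode, client) need, since it tells them where the namenode lives; hdfs-site.xml matters on the namenode and datanodes:

    <!-- conf/core-site.xml: needed on every node, including clients -->
    <configuration>
      <property>
        <name>fs.default.name</name>
        <!-- URI clients and datanodes use to reach the namenode -->
        <value>hdfs://namenode1:9000</value>
      </property>
    </configuration>

    <!-- conf/hdfs-site.xml: needed on the namenode and datanodes -->
    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <!-- where the namenode keeps its metadata (placeholder path) -->
        <value>/data/hdfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <!-- where each datanode stores blocks (placeholder path) -->
        <value>/data/hdfs/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <!-- copies kept of each block; keep it no larger than your datanode count -->
        <value>2</value>
      </property>
    </configuration>

A client that only reads and writes files needs nothing beyond core-site.xml pointing at the namenode.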

Aman