Questions tagged [namenode]

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. It does not store the data of these files itself.

Client applications talk to the NameNode whenever they wish to locate a file, or when they want to add, copy, move, or delete a file. The NameNode responds to successful requests by returning a list of relevant DataNodes where the data lives.
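To make the metadata-only role described above concrete, here is a minimal Python sketch of a toy lookup service. It is not Hadoop code and does not use any Hadoop API: the class `ToyNameNode`, its methods, the block IDs, and the DataNode hostnames are all hypothetical names chosen for illustration. It only shows the idea that the service maps paths to blocks and blocks to DataNodes, and never holds file contents.

```python
class ToyNameNode:
    """Hypothetical in-memory stand-in for a NameNode: metadata only, no file data."""

    def __init__(self):
        # path -> ordered list of block IDs that make up the file
        self.namespace = {}
        # block ID -> hostnames of the DataNodes holding a replica
        self.block_locations = {}

    def add_file(self, path, blocks):
        """Register a file; `blocks` maps each block ID to its DataNode hostnames."""
        self.namespace[path] = list(blocks)
        for block_id, datanodes in blocks.items():
            self.block_locations[block_id] = list(datanodes)

    def locate(self, path):
        """Return [(block_id, [datanodes])] so a client knows where to read from."""
        if path not in self.namespace:
            raise FileNotFoundError(path)
        return [(b, self.block_locations[b]) for b in self.namespace[path]]

    def delete(self, path):
        """Drop the file's metadata; the blocks themselves live on the DataNodes."""
        for block_id in self.namespace.pop(path):
            self.block_locations.pop(block_id, None)


# Illustrative usage with made-up block IDs and hostnames.
nn = ToyNameNode()
nn.add_file("/logs/app.log", {"blk_1": ["dn1", "dn2"], "blk_2": ["dn2", "dn3"]})
locations = nn.locate("/logs/app.log")
```

In this sketch a client would call `locate` and then contact the returned DataNodes directly for the bytes, which mirrors the division of labor in the description: the lookup service answers "where", and the data itself stays elsewhere.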

200 questions
1
vote
1 answer

No resources in GCP Dataproc node to start new SparkSession

I am working on a use case where I have to process a huge amount of data (multiple tables), and I am trying to submit this as a batch job to the Dataproc cluster (PySpark). My code looks something like this: from pyspark import SparkContext from…
1
vote
1 answer

hdfs + namenode + edit files increasing with huge size and how to limit the size of edit files

We have an HDP cluster with 7 datanode machines. Under /hadoop/hdfs/namenode/current/ we can see more than 1500 edit files, each around 7M to 20M, as follows: 7.8M …
jessica
  • 2,426
  • 24
  • 66
1
vote
0 answers

Hadoop namenode and secondary namenode concept

I want to share everything about our case. We have a Hadoop cluster with 2 name nodes, one active name node and one standby name node. After some time we noticed that the active name node and secondary name node had been down for 3 days. After reviewing the name…
jessica
  • 2,426
  • 24
  • 66
1
vote
0 answers

Hadoop Namenode and Secondary Namenode not starting with PDSH exit code 1

aqib@aqib-Inspiron-5521:~$ start-dfs.sh Starting namenodes on [aqib-Inspiron-5521] pdsh@aqib-Inspiron-5521: aqib-Inspiron-5521: ssh exited with exit code 1 Starting datanodes Starting secondary namenodes [aqib-Inspiron-5521] pdsh@aqib-Inspiron-5521:…
1
vote
0 answers

How to recover bad namenode from good namenode

I want to share how to recover a bad namenode using the good namenode, with an example of one bad namenode and one good namenode. For this scenario, let's say the following: the bad namenode is on machine hadoop1. The good namenode is…
jessica
  • 2,426
  • 24
  • 66
1
vote
0 answers

Namenode In hadoop cluster not started after electricity failure

We have a Hadoop cluster on Ambari HDP 2.6.5 that includes 2 namenodes. Because of an electricity failure, neither namenode started. On the first namenode we can see the following behavior, with many cycles of replaying the edit log xxxxxxxxx…
jessica
  • 2,426
  • 24
  • 66
1
vote
0 answers

Namenode not starting without formatting

I have a Hadoop cluster set up manually in high availability with one primary namenode, one standby namenode and one datanode. I formatted the namenode in the initial startup process, but if all the servers shut down due to an electricity outage, I have…
1
vote
1 answer

Memory for Namenode(s) in Hadoop

Environment: The production cluster has 2 name nodes (active and standby), and the nodes have SAS drives in a RAID-1 configuration. These nodes have nothing but the master services (NN and Standby NN) running on each. They have 256GB of RAM…
1
vote
1 answer

Hadoop namenode not starting

When I use start-all.cmd, the datanode, resourcemanager and nodemanager work properly, but the namenode does not! 19/11/04 22:09:14 WARN namenode.FSNamesystem: Encountered exception loading fsimage java.io.IOException: NameNode is not…
1
vote
1 answer

One of datanode's usage reached 100% in hdfs? Balancer is not working

I have some problems with Hadoop HDFS (Hadoop 2.7.3). I have 2 namenodes (1 active, 1 standby) and 3 datanodes, and the replication factor is 3. $ hdfs dfs -df -h / Filesystem Size Used Available Use% hdfs://hadoop-cluster 131.0 T …
soy
  • 143
  • 1
  • 9
1
vote
1 answer

There are 1 datanode(s) running and 1 node(s) are excluded in this operation. (pseudodistributed mode)

I am working with Hadoop 2.7 using Java and I have this error. I can create a file but I cannot write to it. Errors: ERROR File /test/1.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running…
Carlos Noé
  • 103
  • 3
  • 11
1
vote
1 answer

Hadoop HA Standby Namenode start up hang in safemode because of MEMORY problem

After a crash of NN-A (active) because of insufficient memory (too many blocks/files), we upgraded NN-A with much more memory, but did not upgrade NN-B (not active) immediately. With different heap sizes, we deleted some files (80 million to 70 million),…
rrFeng
  • 191
  • 10
1
vote
1 answer

What is the communication port between Namenode and Datanode in hadoop cluster

I want to know the communication protocol, specifically the port number, used by the Namenode and Datanode in Hadoop. Say, if I run the following command on the Namenode, hdfs dfsadmin -report, it will show the details of live nodes (namenode & datanode), how…
sarwar026
  • 3,821
  • 3
  • 26
  • 37
1
vote
1 answer

Why double amount of memory is used for Name Node files?

On the Cloudera blog or in the Hortonworks forum I read: "Every file, directory and block in HDFS is represented as an object in the namenode’s memory, each of which occupies 150 bytes, as a rule of thumb. So 10 million files, each using a block, would use…
grep
  • 5,465
  • 12
  • 60
  • 112
1
vote
1 answer

hadoop 3.1.2 ./start-all.sh error, syntax error near unexpected token `<'

I'm running Hadoop 3.1.2 on Mac, and when executing ./start-all.sh, I got an error saying Starting namenodes on [localhost] /usr/local/Cellar/hadoop/3.1.2/libexec/bin/../libexec/hadoop-functions.sh: line 398: syntax error near unexpected token…
Bargitta
  • 2,266
  • 4
  • 22
  • 35