A DataNode stores data in HDFS (the Hadoop Distributed File System). A functional filesystem has more than one DataNode, with data replicated across them.
Questions tagged [datanode]
86 questions
1 vote, 0 answers
Cannot start secure DataNode due to incorrect config?
All principals are created with their own keytabs,
.... REALM is HADOOP.COM. HOSTNAME is server.hadoop.com. SSH keys are created; I copied the public_key to authorized_key and gave it read and write permission for the owner (chmod 600). I used Unlimited JCE Extensions for…

Ayoub Ba-haddou
1 vote, 0 answers
ERROR: Cannot set priority of secondarynamenode process 84665
I recently installed Hadoop on my MacBook Pro with an M1 chip using Homebrew. After setting up the necessary configuration, I tried to run start-dfs.sh and got this log:
╰─ start-dfs.sh …

david
1 vote, 1 answer
I don't want to store any data on the Hadoop master node. Is that possible?
I have a multinode Hadoop cluster set up: 1 master server and 25 slave nodes. The master node has 2T of storage, whereas the slaves have 18T each. So I don't want a datanode on my master server because it may cause storage issues in the future. How can…

mash
1 vote, 0 answers
In HDFS, is a DataNode online for read/write before it finishes the full block report?
In Apache HDFS, when a DataNode starts, it registers with the NameNode. And a block report happens after some time (not atomic with the register). I haven't fully understood the code but it seems to me the NameNode treats a DataNode that has not…

OrlandoL
1 vote, 0 answers
Data node shuts down automatically with error "WARN datanode.DataNode: Exiting Datanode"
I am receiving the error below for the data node; even the resource manager shuts down automatically
2021-05-05 01:13:32,029 WARN common.Storage: Failed to add storage directory
[DISK]file:/C:/hadoop/data/datanode
java.io.IOException: Incompatible clusterIDs in…

Gamefic
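The "Incompatible clusterIDs" message in the excerpt above refers to the `clusterID` recorded in each storage directory. A minimal sketch of how the two IDs could be compared, assuming the usual layout in which the NameNode and DataNode storage directories each hold a `current/VERSION` file of `key=value` lines with a `clusterID` entry. The sample contents and the helper names (`parse_version_text`, `cluster_ids_match`, `read_version`) are hypothetical and exist only for illustration.

```python
from pathlib import Path


def parse_version_text(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def cluster_ids_match(namenode_version, datanode_version):
    """Return True only if both texts carry the same clusterID."""
    nn_id = parse_version_text(namenode_version).get("clusterID")
    dn_id = parse_version_text(datanode_version).get("clusterID")
    return nn_id is not None and nn_id == dn_id


def read_version(storage_dir):
    """Read <storage_dir>/current/VERSION (assumed layout)."""
    return (Path(storage_dir) / "current" / "VERSION").read_text()


# Made-up sample contents, not taken from any real cluster.
NN_SAMPLE = "#sample\nnamespaceID=1\nclusterID=CID-aaaa\n"
DN_SAMPLE_OK = "#sample\nstorageID=DS-1\nclusterID=CID-aaaa\n"
DN_SAMPLE_BAD = "#sample\nstorageID=DS-1\nclusterID=CID-bbbb\n"

if __name__ == "__main__":
    print(cluster_ids_match(NN_SAMPLE, DN_SAMPLE_OK))
    print(cluster_ids_match(NN_SAMPLE, DN_SAMPLE_BAD))
```

On a real installation the two texts would come from `read_version(...)` called with the directories configured for the NameNode and the DataNode. A mismatch only confirms the condition the error names; how to reconcile the IDs depends on whether the data on the DataNode must be kept.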
1 vote, 0 answers
Add a datanode and back up existing data on a standalone Hadoop on a Windows machine
I have installed Hadoop in standalone mode locally on a Windows machine, with one datanode and the replication factor set to 1. I have already uploaded some data onto the datanode. Let us call this existing datanode datanode1.
I would like to add…

XYZ
1 vote, 0 answers
AWS EMR Spark is creating files on worker nodes
I am using Spark on EMR to process data. Basically, I read data from AWS S3, transform it, and then write the transformed data to Oracle tables.
Recently we found that HDFS (/mnt/hdfs) utilization is getting too high.
I am…

distributed_world
1 vote, 0 answers
Hadoop namenode and secondary namenode concept
I want to share the details of our case.
We have a Hadoop cluster with 2 name nodes: one active name node and one standby name node.
After some time we noticed that the active name node and secondary name node had been down for 3 days.
After reviewing the name…

jessica
1 vote, 1 answer
One datanode's usage reached 100% in HDFS? Balancer is not working
I have some problems with Hadoop HDFS (Hadoop 2.7.3).
I have 2 namenodes (1 active, 1 standby) and 3 datanodes. The replication factor is 3.
$ hdfs dfs -df -h /
Filesystem Size Used Available Use%
hdfs://hadoop-cluster 131.0 T …

soy
1 vote, 1 answer
How do I solve the error at the datanode log during Hadoop configuration?
I installed Hadoop on my Windows system. Only the namenode and resource manager services are running. The remaining daemons (DataNode, SecondaryNameNode and NodeManager) are not visible when using the jps command. The following error is thrown in the…

Jefin
1 vote, 1 answer
There are 1 datanode(s) running and 1 node(s) are excluded in this operation. (pseudodistributed mode)
I am working with Hadoop 2.7 using Java and I get this error. I can create a file but I cannot write to it:
Errors:
ERROR File /test/1.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running…

Carlos Noé
1 vote, 0 answers
How to rebalance the HDFS data size across data node disks
We have a production cluster running HDP version 2.6.4.
We have 186 data node machines (Dell machines with 10 disks).
We are trying to rebalance the data on the disks so that all disks have the same used size, but without success.
We feel that the 2.6.4 version not…

Judy
1 vote, 1 answer
What is the communication port between the NameNode and DataNode in a Hadoop cluster?
I want to know the communication protocol, specifically the port number, used by the NameNode and DataNode in Hadoop.
Say, if I run the following command on the NameNode,
hdfs dfsadmin -report
it will show the details of live nodes (namenode & datanode), how…

sarwar026
1 vote, 0 answers
HDFS datanode: large number of TCP connections in CLOSE_WAIT state
I'm using Apache Druid with a containerized deployment of HDFS in my testbed. After running stably for 5 days, I see one of the HDFS workers is reported as dead on the HDFS UI. Inside the container of this 'dead' worker, I see the process is still…

herlenashavi
1 vote, 1 answer
How to format datanodes after formatting the namenode on hdfs?
I've recently been setting up Hadoop in pseudo-distributed mode, and I created data and loaded it into HDFS. Later I formatted the namenode because of a problem. Now when I do that, I find that the directories and the files which were…

Sai Darahaas Ayyangalam