For questions regarding the Hadoop Distributed File System (HDFS), which is part of the Apache Hadoop project.
Questions tagged [hdfs]
71 questions
0
votes
1 answer
Hadoop: How to configure failover time for a datanode
I need to re-replicate blocks on my HDFS cluster when a datanode fails. This already appears to happen after a period of roughly 10 minutes, but I want to decrease this time and am wondering how to do so.
I tried to set…

frlan
- 573
- 1
- 8
- 27
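For reference, the ~10 minute window in the question above is governed by two real HDFS settings: the NameNode marks a DataNode dead after roughly 2 × dfs.namenode.heartbeat.recheck-interval + 10 × dfs.heartbeat.interval, which with the defaults (300,000 ms and 3 s) comes to 10.5 minutes. A hedged hdfs-site.xml sketch that shortens the window — the values here are illustrative, not recommendations, and the property names are as in Hadoop 2.x:

```xml
<!-- hdfs-site.xml: shorten the dead-node detection window.
     Timeout ≈ 2 * recheck-interval + 10 * heartbeat interval. -->
<property>
  <name>dfs.namenode.heartbeat.recheck-interval</name>
  <value>45000</value> <!-- milliseconds; default 300000 (5 min) -->
</property>
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value> <!-- seconds; default 3 -->
</property>
<!-- With these values: 2*45s + 10*3s = 120s before the node is marked dead. -->
```

Shortening the recheck interval trades faster re-replication for more false positives during GC pauses or network blips.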
0
votes
2 answers
error while running any hadoop hdfs file system command
I am very new to Hadoop and am referring to the "Hadoop For Dummies" book.
I have a VM with the following specs: Hadoop version 2.0.6-alpha (Bigtop), OS: CentOS.
The problem is that when I run any HDFS file system command, I get the following error:
hadoop hdfs dfs -ls…

Raj Kumar Rai
- 3
- 1
- 1
- 3
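One likely culprit is visible in the excerpt itself: `hadoop hdfs dfs -ls` mixes two separate launchers. A sketch of the forms that actually exist (assuming the Hadoop binaries are on PATH):

```shell
# Wrong: "hdfs" is its own launcher script, not a subcommand of "hadoop"
# hadoop hdfs dfs -ls /

# Either of these is a valid form:
hdfs dfs -ls /        # preferred in Hadoop 2.x
hadoop fs -ls /       # older, still-supported equivalent
```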
0
votes
1 answer
Compiling hdfs-fuse bundled with Hadoop
I am trying to compile the hdfs-fuse extension from Hadoop 0.20.2 on a machine running Fedora 14. Below are the packages I have installed:
fuse-2.8.5-2.fc14.x86_64
fuse-libs-2.8.5-2.fc14.x86_64
fuse-devel-2.8.5-2.fc14.x86_64
Then, I have…

Laurent
- 321
- 3
- 14
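In the 0.20.x line, fuse-dfs lived under src/contrib/fuse-dfs and was built with ant rather than with a standalone configure script. As far as I can reconstruct from the old MountableHDFS wiki, the sequence was roughly the following — these flags are from memory and may need adjusting for 0.20.2:

```shell
# Build libhdfs first (fuse-dfs links against it), then the contrib module.
ant compile -Dcompile.c++=true -Dlibhdfs=true
ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1
```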
0
votes
1 answer
hdfs configuration
I am a newbie trying to set up an HDFS system to serve my data (I don't plan to use MapReduce) in my lab.
So far I have read the cluster setup documentation, but I am still confused.
Several questions:
Do I need to have a secondary namenode?
There are 2 files,…

Ananymous
- 1
- 1
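For a minimal single-NameNode setup that only serves files (no MapReduce), the two core configuration files are typically core-site.xml and hdfs-site.xml, and a Secondary NameNode is not strictly required to run the cluster — it only checkpoints the edit log (it is not a failover node), though skipping it lets the edit log grow without bound. A hedged sketch; the hostname and paths below are placeholders:

```xml
<!-- core-site.xml -->
<property>
  <name>fs.defaultFS</name>            <!-- called fs.default.name in Hadoop 1.x -->
  <value>hdfs://namenode-host:8020</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>   <!-- NameNode metadata directory -->
  <value>/data/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>   <!-- DataNode block storage -->
  <value>/data/hdfs/data</value>
</property>
```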
0
votes
1 answer
Can VM machines replace physical machines?
We have 254 physical servers, all of them Dell R740 machines.
The servers are part of a Hadoop cluster. Most of them hold the HDFS filesystem and run the DataNode and NodeManager services; some of them are Kafka machines.
The OS that is installed on the…

King David
- 549
- 6
- 20
0
votes
1 answer
Hadoop recommissioning datanode
Do I need to delete all data from a datanode before recommissioning it, or does it not matter because the namenode will not pick up stale data from the datanode?
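No manual deletion should be needed: when the DataNode rejoins, it sends a full block report, and the NameNode either reuses still-valid replicas or schedules invalid and excess ones for deletion. The usual sequence is sketched below; the exact file to edit depends on your dfs.hosts.exclude setting:

```shell
# 1. Remove the host from the excludes file referenced by dfs.hosts.exclude
# 2. Tell the NameNode to re-read the include/exclude lists:
hdfs dfsadmin -refreshNodes
# 3. Restart the DataNode process on the recommissioned host
```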
0
votes
1 answer
Change HDFS replication factor
I've changed the replication factor from 3 to 2 for some directories with the command:
hdfs dfs -setrep -R 2 /path/to/dir
but my HDFS free space is still the same. Should I do something else to free up my disks?

John Brown
- 1
- 1
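Excess replicas are deleted asynchronously by the NameNode, so freed space can take a while to appear; note also that -setrep only changes existing files, while newly written files still use the dfs.replication default. A couple of hedged commands to check progress (the file name below is a placeholder):

```shell
# Show the stored replication factor of a file under the changed directory
hdfs dfs -stat %r /path/to/dir/somefile

# fsck reports over-replicated blocks still awaiting deletion
hdfs fsck /path/to/dir -files -blocks
```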
0
votes
1 answer
HDFS: how to free one particular disk
I have a cluster with 3 servers. Two of them have 2 TB disks and the other has a 500 GB SSD. I am trying to use the balancer, but I still get 70% usage on the 2 TB disks and 99% on the 500 GB disk due to non-DFS files. The replication factor is 2. Is it possible to free…

John Brown
- 1
- 1
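The balancer only moves DFS blocks, so it cannot help when the space is consumed by non-DFS files; what can help is reserving headroom so the DataNode stops filling the small disk. A hedged hdfs-site.xml fragment for the SSD node — the reserve value is illustrative:

```xml
<!-- hdfs-site.xml on the 500 GB node: keep space out of HDFS's hands -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>107374182400</value> <!-- 100 GB, in bytes, reserved for non-DFS use -->
</property>
```

After cleaning up the non-DFS files themselves, `hdfs balancer -threshold 5` can then even out the remaining DFS usage across the nodes.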
0
votes
1 answer
Hadoop Cluster Capacity Planning of Data Nodes for disks per data node
We are planning to build a Hadoop cluster with 12 data node machines,
with a replication factor of 3
and a DataNode failed-disk tolerance of 1.
The data node machines include the disks for HDFS.
Since we have not found the criteria for how many disks need…

King David
- 549
- 6
- 20
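The back-of-the-envelope arithmetic for this kind of planning is simple enough to sketch. In the snippet below, only the node count (12) and replication factor (3) come from the question; the disks per node, disk size, and non-DFS overhead are assumptions for illustration:

```python
NODES = 12               # data nodes (from the question)
DISKS_PER_NODE = 8       # assumption
DISK_TB = 4              # assumption: TB per disk
REPLICATION = 3          # from the question
NON_DFS_OVERHEAD = 0.25  # assumption: space kept for OS, logs, temp files

def usable_tb(nodes, disks, disk_tb, replication, overhead):
    """Raw capacity, minus the non-DFS reserve, divided by the replication factor."""
    raw = nodes * disks * disk_tb
    return raw * (1 - overhead) / replication

print(usable_tb(NODES, DISKS_PER_NODE, DISK_TB, REPLICATION, NON_DFS_OVERHEAD))
# 12*8*4 = 384 raw TB -> 384 * 0.75 / 3 = 96 TB usable
```

With a failed-disk tolerance of 1, capacity should additionally be sized so the cluster stays under its target utilization with one disk per node lost.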
0
votes
1 answer
Optimal RAID configuration for EC2 instance store used for HDFS
I'm trying to determine whether there is any practical advantage to configuring a RAID array on the instance store of three d2.2xlarge instances being used for HDFS. Initially I planned to just mount each store and add it as an additional data directory…

John R
- 383
- 4
- 13
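For what it's worth, Hadoop documentation has generally recommended JBOD over RAID for DataNode disks: HDFS replication already provides redundancy, and a RAID-0 stripe runs at the speed of its slowest disk. The usual alternative to a RAID array is simply listing each mounted store — the mount points below are placeholders:

```xml
<!-- hdfs-site.xml: one entry per instance-store mount, no RAID -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt/store0,/mnt/store1,/mnt/store2,/mnt/store3,/mnt/store4,/mnt/store5</value>
</property>
```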
-2
votes
1 answer
How does placing data in various racks help to exploit the fact that aggregated intra-rack bandwidth >= inter-rack bandwidth?
[Snapshot from the GFS research paper]
It says (my interpretation after reading the research paper and its reviews) that "inter-rack bandwidth is lower than aggregated intra-rack bandwidth" (not sure what is meant by aggregated; it doesn't make much sense of kind of…

gibmegucci
- 1
- 1
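A toy calculation may clarify what "aggregated" usually means in this context: the sum of the per-server links the top-of-rack switch can carry concurrently, versus the rack's single uplink to the core. All the numbers below are assumptions for illustration, not values from the GFS paper:

```python
SERVERS_PER_RACK = 40  # assumption
NIC_GBPS = 1           # assumption: each server's link to its top-of-rack switch
UPLINK_GBPS = 10       # assumption: the rack's uplink to the core switch

# Aggregated intra-rack bandwidth: all servers talking inside the rack at once.
intra_rack_aggregate = SERVERS_PER_RACK * NIC_GBPS   # 40 Gbps inside the rack

# Traffic leaving the rack is squeezed through the uplink: 4:1 oversubscription.
oversubscription = intra_rack_aggregate / UPLINK_GBPS

print(intra_rack_aggregate, oversubscription)
```

This is why spreading replicas across racks pays off: reads can be served from a replica in the reader's own rack, keeping traffic off the oversubscribed uplink.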