Questions tagged [hadoop3]

Use for questions specific to Apache Hadoop 3.0 features (e.g. Erasure Coding, YARN Timeline Service v2, Opportunistic Containers, 3+ NameNode fault-tolerance). For general questions related to Apache Hadoop, use the tag [hadoop].

112 questions
1
vote
1 answer

What is an Unmanaged Application Master and what is its role in Hadoop YARN Federation?

I can't find much information about how an Unmanaged AM works. I know the basic definition, but I'm still not sure how it is managed and by whom. Also, in the Apache documentation, it is mentioned (point 8 in job execution…
anand011090
  • 65
  • 10
1
vote
0 answers

Can't access the web interface in Hadoop version 3

I'm running Ubuntu 18 in a VMware workstation, and have followed these instructions: https://www.digitalocean.com/community/tutorials/how-to-install-hadoop-in-stand-alone-mode-on-ubuntu-18-04 It works, but I can't access the web interface on…
Patrick
  • 43
  • 4
1
vote
1 answer

Hadoop 3 : how to configure / enable erasure coding?

I'm trying to set up a Hadoop 3 cluster. Two questions about the Erasure Coding feature: How can I ensure that erasure coding is enabled? Do I still need to set the replication factor to 3? Please indicate the relevant configuration properties…
Klun
  • 78
  • 2
  • 25
1
vote
1 answer

Datanode started but not shown in dfsadmin -report

I am trying to install Hadoop 3.1.0 on two virtual machines: the first machine contains one name node and one data node, the second contains one data node. I followed the article "Install Hadoop 3.0.0 multi-node cluster on Ubuntu". And every goes…
Hadi
  • 36,233
  • 13
  • 65
  • 124
1
vote
1 answer

Hadoop 3.0 erasure coding - determining the number of acceptable node failures?

In Hadoop 2.0 the default replication factor is 3, and the number of acceptable node failures was 3-1=2. So on a 100-node cluster, if a file was divided into, say, 10 parts (blocks), with a replication factor of 3 the total storage blocks required are…
samshers
  • 1
  • 6
  • 37
  • 84
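The comparison this question asks about can be sketched with a little arithmetic. The snippet below is a minimal, idealized sketch and not HDFS code: the function names are hypothetical, the Reed-Solomon layout with 6 data units and 3 parity units is an assumed example policy, and partial stripes and block placement are ignored.

```python
def replication_stats(data_units, replication_factor):
    """Storage used and copies that can be lost under n-way replication."""
    stored = data_units * replication_factor
    tolerated_failures = replication_factor - 1
    return stored, tolerated_failures


def reed_solomon_stats(data_units, k, m):
    """Idealized storage and tolerated losses for an RS(k, m) layout.

    Every k data units are accompanied by m parity units, so storage grows
    by a factor of (k + m) / k and any m units of a group may be lost.
    """
    stored = data_units * (k + m) / k
    tolerated_failures = m
    return stored, tolerated_failures


# A file of 10 blocks, as in the question above.
print(replication_stats(10, 3))      # 3x replication
print(reed_solomon_stats(10, 6, 3))  # assumed RS(6, 3) policy
```

Under these assumptions, replication stores three times the data and survives the loss of two copies of a block, while RS(6, 3) stores one and a half times the data and survives the loss of any three units within a group.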
1
vote
1 answer

How to configure the erasure coding feature in hadoop3 and is it used for storing cold files only by default?

As per the Hadoop 3.x release notes, they have introduced Erasure coding to overcome the problems with storage. Erasure coding is a method for durably storing data with significant space savings compared to replication. Standard encodings like…
1
vote
0 answers

GPU resource for hadoop 3.0 / yarn

I am trying to use the Hadoop 3.0 GA release with a GPU, but when I execute the shell command below, there is an error and it does not work with the GPU. Please check the command below and let me know what is wrong. I guess that I have misconfigured something.…
Kangrok Lee
  • 101
  • 13
0
votes
0 answers

compile native hadoop3.0.0 failed: hadoop-common: make failed with error code 2

I have installed all the necessary software listed in BUILDING.txt. I run #mvn package -Pdist,native -DskipTests -Dmaven.javadoc.skip=true -DskipShade -e -Drequire.isal and it shows the output below. [INFO] Reactor Summary: [INFO] [INFO] Apache Hadoop Main…
0
votes
0 answers

I set up a remote Hive metastore on PostgreSQL, but on access I get a "DBS" table insert error (DDL is not matching)

Hive 3.1.3 with PostgreSQL 12 as a remote metastore. I changed the Spark and Hive site.xml files and used schematool to populate the default tables, using Oracle object storage as Hadoop storage. I have replaced the actual path with a placeholder. Note: I have Hive and Spark on the same server.…
Harish
  • 969
  • 2
  • 10
  • 15
0
votes
1 answer

Hive with TEZ failed to start Hive CLI

Apache Hive 3.1.2 with Hadoop 3.1.1 was working fine until I configured Hive with Tez based on this doc: https://github.com/NitinKumar94/Installing-Apache-Tez. Now it always gives this error. I tried many solutions with no luck …
Kaher
  • 3
  • 1
0
votes
1 answer

pyspark-connect can't show all hive databases

I'm using the spark-connect module of PySpark 3.4.0 to connect to a remote Hive 3.1.3. When I create a SparkSession in local mode with Hive support, all databases in Hive can be viewed; spark =…
leon
  • 1
0
votes
0 answers

Test cases fail with permission denied error with hadoop-minicluster initialization for 3.2.2 version

I am trying to run JUnit tests for a Spark project in IntelliJ. The tests initialize a local Hadoop cluster using the hadoop-minicluster dependency. Tests run fine with Hadoop version 2.7.3.2.6.5.0-292. Since we upgraded our environment, I need to rebuild…
mbr
  • 11
  • 2
0
votes
0 answers

Install functionality in Hadoop Cluster in Docker

What I'm trying to do is use multiple datanodes on a single machine in order to test the erasure coding features implemented in version 3.0. I'm using Docker with Big Data Europe's containers (https://github.com/big-data-europe/docker-hadoop) and a…
0
votes
0 answers

How to install HBase 2.4 with Hadoop 3.3

I currently have a Hadoop cluster on 2.8 and an HBase cluster on 2.4. HBase installation is quite simple: I download the binaries from hbase.apache.org, untar them, set the Java home and ZooKeeper, update hbase-site.xml and all the conf files, and that's all. I've…
0
votes
0 answers

EC2 User data not formatting bash variables

I have a line in my EC2 instance user data, shown below: cat <> $HOME/.bashrc export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 export HADOOP_HOME=$HOME/hadoop export HADOOP_CONF=$HADOOP_HOME/etc export…
Temmyzeus
  • 36
  • 2