Questions tagged [hadoop2]

Hadoop 2 is the second generation of the popular open-source distributed computing platform Apache Hadoop.

Apache Hadoop 2.x brings significant improvements over the previous stable release line, Hadoop 1.x. Major enhancements have been made to both of Hadoop's building blocks, HDFS and MapReduce:

  1. HDFS Federation: To scale the name service horizontally, federation uses multiple independent NameNodes/namespaces (see the configuration sketch after this list).

  2. MapReduce NextGen (aka YARN, aka MRv2): The new architecture splits the two major functions of the JobTracker, resource management and job life-cycle management, into separate components. The new ResourceManager manages the global assignment of compute resources to applications, while a per-application ApplicationMaster manages the application's scheduling and coordination. An application is either a single job in the sense of classic MapReduce jobs or a DAG of such jobs. The ResourceManager and the per-machine NodeManager daemon, which manages the user processes on that machine, form the computation fabric (a job-submission sketch appears at the end of this introduction).
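
A minimal sketch of what HDFS Federation configuration looks like, expressed through the Java Configuration API only so the example is self-contained; in practice these properties live in hdfs-site.xml, and the nameservice IDs and host names below are illustrative assumptions.

    import org.apache.hadoop.conf.Configuration;

    // Hedged sketch: two independent NameNodes/namespaces registered as separate
    // nameservices. "ns1"/"ns2" and the hosts are hypothetical placeholders.
    public class FederationConfigSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.set("dfs.nameservices", "ns1,ns2");                          // two independent namespaces
            conf.set("dfs.namenode.rpc-address.ns1", "nn1.example.com:8020"); // NameNode serving ns1
            conf.set("dfs.namenode.rpc-address.ns2", "nn2.example.com:8020"); // NameNode serving ns2
            System.out.println("Configured nameservices: " + conf.get("dfs.nameservices"));
        }
    }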

For more information, visit the official Hadoop 2 homepage.
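
As a concrete illustration of the flow described above, here is a hedged sketch of submitting a classic MapReduce job to YARN: setting mapreduce.framework.name to yarn routes the submission to the ResourceManager, which launches a per-job ApplicationMaster to coordinate the tasks. The class name and input/output paths are placeholders, not taken from the description above.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Minimal driver sketch; without explicit mapper/reducer classes it runs the
    // identity map/reduce, which is enough to show the submission path.
    public class YarnJobSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("mapreduce.framework.name", "yarn"); // submit to the ResourceManager
            Job job = Job.getInstance(conf, "mrv2-sketch");
            job.setJarByClass(YarnJobSketch.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // input dataset (placeholder)
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (placeholder)
            // The ResourceManager allocates containers; the per-application
            // ApplicationMaster then schedules map and reduce tasks on NodeManagers.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }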

2047 questions
0
votes
0 answers

Datanode not able to contact name node on AWS

I am trying to set up a Hadoop cluster on AWS with two datanodes and one namenode. I followed the Tutorials Point multinode cluster setup. I have started the name node, secondary name node and datanodes from the namenode server. Datanode is…
0
votes
1 answer

Which class connects a MapReduce job to its dataset in the Hadoop source code?

I've read the classes in hadoop-common/src/util, but I can't find the class which relates the job to its dataset. How does Hadoop know which MapReduce job relates to which dataset?
fatima
  • 1
0
votes
2 answers

Apache PIG, JSON Loader

This is my sample input file: [{"disknum":36,"disksum":136.401,"disk_rate":1872.0,"disk_lnum": 13}] [{"disknum":36,"disksum":105.2,"disk_rate":123084.8,"disk_lnum": 13}] I'm trying to parse this JSON data using JsonLoader in Pig. Here is my…
Rohit Nimmala
  • 1,459
  • 10
  • 28
0
votes
1 answer

Text to String map reduce

I am trying to split a string using MapReduce 2 (YARN) in the Hortonworks Sandbox. It throws an ArrayOutOfBound exception when I try to access val[1], but works fine when I don't split the input file. Mapper: public class MapperClass extends…
Sundari
  • 33
  • 7
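
A hedged sketch related to the excerpt above, not the original code from the question: a mapper that converts the incoming Text to a String, splits it, and guards the index access so a short line cannot trigger an ArrayIndexOutOfBoundsException. The comma delimiter and the use of field 1 as the output key are assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Guarded Text-to-String splitting; lines with too few fields are skipped
    // instead of throwing when val[1] is accessed.
    public class MapperClass extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] val = value.toString().split(",");  // assumed comma-delimited input
            if (val.length > 1) {                        // guard before accessing val[1]
                context.write(new Text(val[1]), ONE);
            }
        }
    }
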
0
votes
2 answers

Hadoop cannot see my input directory

I am following the Apache MapReduce tutorial and I am at the point of assigning input and output directories. I created both directories here: ~/projects/hadoop/WordCount/input/ ~/projects/hadoop/WordCount/output/ but when I run fs, the file and…
Slinky
  • 5,662
  • 14
  • 76
  • 130
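
A hedged sketch prompted by the excerpt above: when the job runs against HDFS rather than the local filesystem, the input directory has to exist in HDFS first. The paths below are illustrative assumptions, not the tutorial's exact layout.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Creates an HDFS input directory and copies a local sample file into it,
    // using whatever fs.defaultFS points at in core-site.xml.
    public class PrepareInput {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path input = new Path("/user/hadoop/wordcount/input");    // hypothetical HDFS path
            fs.mkdirs(input);
            fs.copyFromLocalFile(new Path("/home/hadoop/sample.txt"), // hypothetical local file
                                 input);
        }
    }
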
0
votes
0 answers

Spark 1.6.3 configuration on Hadoop 2.7.3 in fully distributed mode

I am trying to configure Spark but am getting ERROR SparkContext: Error initializing SparkContext. I copied the hdfs-site.xml and core-site.xml files from $HADOOP_HOME/etc/hadoop, also copied hive-site.xml, and put them in the $SPARK_HOME/conf folder. Then I added…
Mahmud
  • 87
  • 10
0
votes
2 answers

Spark and HBase version compatibility

I am trying to integrate Spark and HBase 1.2.4. I am currently using Hadoop 2.7.3. Can somebody tell me which version of Spark is compatible with HBase 1.2.4?
Mahmud
  • 87
  • 10
0
votes
0 answers

"HMaster" service starts always on the master backup machine/node

I have installed HBase on top of Hadoop 2 successfully, using one master node and three data nodes (one of them is the backup master). I had been working with no problems for a while, until I restarted all the nodes/machines. Everything…
abutmah
  • 63
  • 1
  • 3
  • 9
0
votes
0 answers

DB2 "with ur" in Hive

I am migrating quite a few data processing jobs from DB2 to Hive. I came across a "select" query in DB2 ending with the clause "with ur", like the one below: select field1, field2 from table1 where field3=value1 with ur The "with ur" clause is known…
Marco99
  • 1,639
  • 1
  • 19
  • 32
0
votes
0 answers

Unable to start namenode in Hadoop

I am installing Hadoop 2.7.3 on my Ubuntu 16.04 system. I am getting the following error while trying to execute start-dfs.sh. I have checked all the configuration files. node@hellbot:~$ start-dfs.sh 17/01/28 20:46:26 WARN util.NativeCodeLoader: Unable to…
Kumar-58
  • 47
  • 6
0
votes
1 answer

Spark: Spark UI not reflecting the right executor count

We are running a Spark Streaming application and want to increase the number of executors Spark uses, so we updated spark-default.conf, increasing spark.executor.instances from 28 to 40, but the change is not reflected in the UI. 1 Master/Driver…
user2359997
  • 561
  • 1
  • 16
  • 40
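
A hedged alternative sketch prompted by the excerpt above: the executor count can also be set on the SparkConf (or with --num-executors on spark-submit) rather than in the defaults file. The application name is a placeholder, and the value 40 simply mirrors the question.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // The master URL is expected to be supplied by spark-submit.
    public class ExecutorCountSketch {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("streaming-app")             // hypothetical app name
                    .set("spark.executor.instances", "40");  // value taken from the question
            JavaSparkContext sc = new JavaSparkContext(conf);
            // ... build the streaming context on top of sc ...
            sc.stop();
        }
    }
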
0
votes
1 answer

How does the Name Node determine Data Node availability for HDFS writes in Hadoop?

I have 10 data nodes, the replication factor is 3, the file size is 150 and the block size is 64, so the file will be split into three blocks B1, B2, B3. The client asks the Name Node for the availability of data nodes for writing block B1. My question is how many…
sidhartha pani
  • 623
  • 2
  • 12
  • 23
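
A worked check of the block arithmetic stated in the excerpt above (units assumed to be MB, which the excerpt does not specify):

    // ceil(150 / 64) = 3 blocks; with replication factor 3 that means 9 block replicas.
    public class BlockMath {
        public static void main(String[] args) {
            long fileSize = 150, blockSize = 64, replication = 3;
            long blocks = (fileSize + blockSize - 1) / blockSize;  // ceiling division -> 3
            System.out.println(blocks + " blocks, " + blocks * replication + " replicas in total");
        }
    }
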
0
votes
1 answer

Where are HDFS directories created in Hadoop?

I am running a simple, get-my-feet-wet MapReduce job in pseudo-distributed mode, like so: bin/hadoop jar tm.jar TestMap input output It ran fine the first time, but on the second run I am getting the following: Exception in thread "main"…
Slinky
  • 5,662
  • 14
  • 76
  • 130
0
votes
0 answers

Talend: tHiveInput throws an error only when a 'Where' clause is used in the query; works fine if I remove the clause

I am a newbie to both Talend and Hive. I want to query the Hive table and output the data to a CSV file. I created tHiveConnection (I was able to connect to the Hive database) and tHiveInput (used the 'use existing connection' option and wrote the query), connected…
Trini
  • 349
  • 1
  • 5
  • 19
0
votes
0 answers

DataNode is not starting on a Hadoop multinode cluster

I removed the contents of the Hadoop tmp directory, dropped the current folder from the namenode directory, and formatted the namenode, but got an exception: org.apache.hadoop.http.HttpServer2: HttpServer.start() threw a non Bind IOException java.net.BindException: Port…
Mahmud
  • 87
  • 10