Questions tagged [hadoop3]

Use for questions specific to Apache Hadoop 3.0 features (e.g. Erasure Coding, YARN Timeline Service v2, Opportunistic Containers, 3+ NameNode fault-tolerance). For general questions related to Apache Hadoop, use the tag [hadoop].

112 questions
0 votes, 1 answer

PyArrow OSError: [WinError 193] %1 is not a valid win32 application

My OS is Windows 10 64-bit and I use Anaconda 3.8 64-bit. I am trying to develop a Hadoop File System 3.3 client with the PyArrow module. Installation of PyArrow with conda on Windows 10 is successful. > conda install -c conda-forge pyarrow But connection of…
Joseph Hwang
0 votes, 1 answer

Cannot run hadoop jar command on Hadoop 3.2.1: failed on connection exception: java.net.ConnectException: Connection refused

I've installed Hadoop 3.2.1 on Ubuntu 20.04 in VirtualBox for a college assignment with a deadline, so I'm new to Hadoop. I've searched several sources on the internet for how to run MapReduce on Hadoop. But when I type this in the terminal: hadoop jar…
pup_in_the_tree
0 votes, 1 answer

Hadoop3 balancer vs disk balancer

I read the Hadoop 3 documentation about the disk balancer, and it says "Diskbalancer is a command line tool that distributes data evenly on all disks of a datanode. This tool is different from Balancer which takes care of cluster-wide data balancing." I…
eyeballs
0 votes, 2 answers

Hadoop web interface is not working even though the nodes are starting

I'm trying to install Hadoop v3.1.3 in pseudo-distributed mode in my Ubuntu 18.04 environment. After following the documentation word for word, my web interface is still not working, i.e. localhost:9870 yields no result. Log files are getting created…
Utsav
0 votes, 1 answer

Hadoop 3.2.1 localhost: ERROR: You must be a privileged user in order to run a secure service

I am trying to install a simple Hadoop setup on Ubuntu 20 under Windows WSL. I am able to get the NameNode and YARN running, but the DataNode is failing. I get the following error while trying to run start-dfs.sh: hadoopuser@mycompu:~/hadoop$…
virtuvious
0 votes, 1 answer

Not able to create Hive table using Sqoop

I am trying the command below to import the MySQL table stocks into Hive (v3.1.2) on Ubuntu 18.04 and Hadoop 3 using Sqoop (v1.4.7): sqoop import --connect jdbc:mysql://localhost/myhadoop --username hiveuser --password xxx --table stocks --bindir…
suneesh
0 votes, 0 answers

Hadoop YARN ResourceManager not able to start due to error

I am trying to run Hadoop (HDFS and YARN) in a multi-node cluster (2 nodes), but the ResourceManager fails to start on the slave node. It fails due to the exception below: it is not able to find a class called javax.activation.DataSource (which is…
Learner
0 votes, 1 answer

Sqoop is not working on Ubuntu 18.04 with Hadoop 3.1.3

I am getting the error below on my Ubuntu (18.04) machine while launching Sqoop (1.4.7, Hadoop 3.1.3). Command used: sqoop import --connect jdbc:mysql://localhost/myhadoop --username hiveuser --password xxxx --table employee --split-by --target-dir…
suneesh
0 votes, 1 answer

Hadoop Client unable to connect to datanode

I have a single-node Hadoop cluster on EC2. I tried all possible combinations in the slaves file. May 01 2020 08:16:25.227 DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.31.45.114:9866 May 01 2020 08:16:25.227 DEBUG…
0 votes, 0 answers

Could not find or load main class hdfs problem

I am trying to use Apache Rya for some tests (https://rya.apache.org/). For those who are familiar with Rya and RDF stores, I am trying to do a bulk load, which is explained here:…
M.Taki_Eddine
0 votes, 0 answers

ERROR: datanode can only be executed by harry

I want to start all daemons (NameNode and DataNode), but when I ran start-all.sh it returned: ERROR: datanode can only be executed by harry. How do I fix this?
Key Jun
0 votes, 1 answer

Secondary NameNode in Hadoop

Suppose the default checkpoint interval is 1 hour. If the NameNode goes down 55 minutes after the last checkpoint, do we lose the last 55 minutes of data (the edit log data has not been added to the fsimage)?
0 votes, 1 answer

Dask - trying to read hdfs data getting error ArrowIOError: HDFS file does not exist

I tried creating a dataframe from a CSV stored in HDFS. Connecting is successful, but I get an error when trying to get the output of the len function. Code: from dask_yarn import YarnCluster from dask.distributed import Client, LocalCluster import…
Nirmal Ram
0 votes, 0 answers

Hadoop 3.2: No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster)

I have a local Hadoop 3.2 installation: 1 master + 1 worker, both running on my laptop. This is an experimental setup for quick tests before submitting to a real cluster. Everything is in good health: $ jps 22326 NodeManager 21641…
David Guyon
0 votes, 3 answers

Hadoop 3.1.2 NameNode not starting

I am new to Hadoop, so I would really appreciate any feedback on this issue. The Hadoop setup seems fine. I am able to start it, but when I checked the web UI at: http://localhost:50070 or http://localhost:9870 it shows the site can't be reached.…
Marguerite