Questions tagged [hadoop]

Hadoop is an open-source project that provides a distributed, replicated file system (HDFS) and a production-grade map-reduce system, along with complementary additions such as Hive, Pig, and HBase that get more out of a Hadoop-powered cluster.

Hadoop is an Apache Software Foundation project, with commercial support provided by multiple vendors, including Cloudera, Hortonworks, and MapR. Apache documents a more complete list of commercial solutions.

Standard components and available complementary additions include:

Hadoop distributed filesystem (standard)
The map-reduce architecture (standard)
Hive, which provides a SQL-like interface to the map-reduce architecture
HBase, a distributed key-value service

Recommended reference sources:

Hive Language Reference

261 questions

0 answers

How to debug policy enforcement on YARN queues?

I have an HDP 3.1 cluster, and it seems that either the fair policy isn't behaving as expected or YARN is misconfigured, since some users/applications/jobs are consuming more resources than we expected them to use. So how do I debug/monitor YARN in a way that…

hadoop

asked Aug 10 '21 at 20:24

jguilhermemv

0 answers

Backup and restore strategy in an HBase cluster

I have just started working with HBase. I have an HBase cluster with 2 master nodes and 4 slave nodes. I have one HBase table into which a huge amount of data is loaded every day, so the disk fills up quickly. I would like to implement a backup and restore…

backup-restoration hadoop hbase

asked Dec 22 '20 at 05:38

Juvenik

2 answers

NoNode for HBase master in pseudo-distributed mode

I am using Ubuntu 18.04, Hadoop 3.1.3, and HBase 2.2.1. It seems to me that my Hadoop and HBase are not configured correctly to interact. When I try to create a table through the HBase shell, I get the following error: ERROR: KeeperErrorCode =…

ubuntu hadoop hbase

asked Mar 13 '20 at 12:21

Jonas Grønbek

1 answer

mkfs + xfs: what is the right mkfs command line to create an XFS file system on a huge disk?

We need to create an XFS file system on a Kafka disk. What is special about the Kafka disk is its size: in our case the Kafka disk is 20 TB. I am not sure about the following mkfs command, and I need advice on whether the following command line is good enough to…

hard-drive xfs hadoop kafka mkfs

asked Dec 15 '19 at 13:56

shalom

1 answer

Is it possible to mix different RHEL OS versions in a Hadoop cluster?

We are using the following HDP cluster with Ambari. List of nodes and their RHEL versions: 3 master machines (with NameNode and ResourceManager), installed on RHEL 7.2; 312 DataNode machines, installed on RHEL 7.2; 5 Kafka machines, installed…

redhat rhel7 hadoop hdfs apache-spark

asked Nov 20 '19 at 19:48

shalom

0 answers

Any benefits of ZFS over EXT4 for data stream processing on top of HDFS?

I'm working on a data stream processing project in which I will be using Apache Flink and Apache Spark, and I want to use HDFS for storage. The development and testing will be done on a single-node cluster with multiple physical disks. I have already…

zfs ext4 hadoop hdfs apache-spark

asked Oct 08 '19 at 15:36

HUSMEN

0 answers

Request timeouts / sessions stalling through iptables (DNAT)

Scenario: A customer recently migrated clustered HANA DB servers to the Azure cloud platform, but these are physical servers on Azure (offering: Azure HLI). Usually these HLIs (HANA DB servers) in Azure cannot be accessed directly, not even from Azure…

linux iptables hadoop dnat sles11

asked Jul 29 '19 at 14:33

Ram Too

1 answer

Transferring data between two Hadoop clusters without direct network connectivity

I need to transfer data fairly regularly (on demand, not scripted / streamed) between two independent Hadoop clusters. One of them is deployed in an isolated network and has no direct access to the other. I tried searching the official…

proxy hadoop

asked Jul 10 '19 at 10:03

vdrandom

1 answer

hadoop + can we install ZooKeeper servers on Kafka hosts?

We want to dedicate the ZooKeeper servers to the Kafka machines only, so that each Kafka machine includes a ZooKeeper server, and the ZooKeeper servers serve only the Kafka hosts and no other application. Is that OK?

hadoop kafka zookeeper

asked Jul 01 '19 at 05:46

shalom

2 answers

High Active(file) Memory Usage in Oracle Linux VMs

I recently searched and read lots of posts and questions about Linux memory management but I can't find my case. For example, there is a question in Unix StackExchange about High memory usage but no process is using it. In this post, the accepted…

linux virtual-machines memory linux-kernel hadoop

asked Apr 23 '19 at 06:25

Mahdizade

1 answer

HDFS balancing: how to balance HDFS data?

We have Hadoop version 2.6.4. On the datanode machine we can see that the HDFS data isn't balanced. Some disks have different used sizes, such as sdb 11G and sdd 17G: /dev/sdd 20G 3.0G 17G 15% /grid/sdd /dev/sdb 20G 11G 9.3G 53% /grid/sdb <-- WHY…

linux hadoop hdfs big-data

asked Mar 07 '19 at 17:23

shalom

1 answer

Install Nvidia Drivers 9.0 for TensorFlow pip (Debian 9.7)

I installed Nvidia drivers 9.1 on my Debian 9.7 (Dataproc). When I try to run TensorFlow 1.9 via this test script, it fails. I used this guide to install the GPU drivers: https://cloud.google.com/dataproc/docs/concepts/compute/gpus I used pip install…

debian google-cloud-platform hadoop nvidia

asked Feb 11 '19 at 23:11

gogasca

1 answer

How to configure Kerberos authentication in browsers running inside a Citrix session?

We connect to our secure client network via Citrix. We use Chrome to open all quick links, such as Ambari. Those open fine, but other useful links, like the RM and History Server links, do not open because they need Kerberos…

kerberos citrix hadoop resource-management

asked Dec 20 '18 at 14:08

akash sharma

0 answers

Avoid Kafka disk becoming 100% used by cron job

We want to suggest the following, based on our issues with Kafka disks. We have many HDP clusters (based on Ambari, with all machines on Red Hat version 7.2). Each cluster includes 3 Kafka machines, and each Kafka machine includes a disk of ~15 TB. Because we…

linux hadoop kafka big-data

asked Nov 05 '18 at 19:02

shalom

1 answer

Free Azure account: HDInsight and cores issue

I am using a free Azure account and I am trying to create the resources needed to set up HDInsight. I have done it twice, but in order to save the time/money I have available, I deleted the resource group. Unfortunately now that…

azure hadoop microsoft

asked Sep 27 '18 at 09:15

Nicola
