Questions tagged [hadoop]

Hadoop is an open-source solution providing a distributed, replicated file system and a production-grade map-reduce system, along with a series of complementary additions such as Hive, Pig, and HBase to get more out of a Hadoop-powered cluster.

Hadoop is an Apache Software Foundation project, with commercial support provided by multiple vendors, including Cloudera, Hortonworks, and MapR. A more complete list of commercial solutions is documented by Apache.

Available complementary additions to Hadoop include:

  • The Hadoop Distributed File System, HDFS (standard)
  • The map-reduce architecture (standard)
  • Hive, which provides a SQL-like interface to the map-reduce architecture (see the sketch after this list)
  • HBase, a distributed key-value store
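For example, Hive lets you express a query in SQL-like syntax and have it compiled into map-reduce jobs. A minimal sketch, assuming a tab-delimited log file already uploaded to HDFS (the table name, columns, and path are hypothetical):

    -- define a table over files already stored in HDFS
    CREATE EXTERNAL TABLE weblogs (host STRING, url STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
      LOCATION '/data/weblogs';

    -- Hive compiles this into one or more map-reduce jobs
    SELECT host, COUNT(*) AS hits FROM weblogs GROUP BY host;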


261 questions
0 votes, 3 answers

I/O and RAM limitations are important for Hadoop performance. But is disk speed related to I/O?

Hortonworks says this: "Most often performance of a Hadoop cluster will not be constrained by disk speed – I/O and RAM limitations will be more important." How is disk speed not related to I/O limitations?
Propulsion
  • 158
  • 2
  • 9
0 votes, 1 answer

Store a database on a Hadoop cluster

I'm learning Hadoop and Hive server, and I'm confused about something. Suppose I build a Hadoop cluster with three machines, and I start storing images with a PHP/MySQL script. Now, for the MySQL database, can I install Hive on the same Hadoop server or…
0 votes, 1 answer

Flume: error log while using FileChannel

I am using Flume (flume-ng-1.5.0, with CDH 5.4) to collect logs from many servers and sink them to HDFS. Here is my configuration: #Define Source, Sinks, Channel collector.sources = avro collector.sinks = HadoopOut collector.channels = fileChannel #…
Summer Nguyen
  • 214
  • 3
  • 10
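For reference, a complete single-agent configuration along the lines the excerpt suggests might look like the following; the component names (collector, avro, fileChannel, HadoopOut) come from the excerpt, while the port and directories are assumptions. FileChannel errors often trace back to checkpoint/data directories that the flume user cannot write to.

    # name the components on this agent
    collector.sources  = avro
    collector.sinks    = HadoopOut
    collector.channels = fileChannel

    # Avro source receiving events from upstream agents (port is an assumption)
    collector.sources.avro.type = avro
    collector.sources.avro.bind = 0.0.0.0
    collector.sources.avro.port = 4545
    collector.sources.avro.channels = fileChannel

    # durable file channel; both directories must be writable by the flume user
    collector.channels.fileChannel.type = file
    collector.channels.fileChannel.checkpointDir = /var/lib/flume-ng/checkpoint
    collector.channels.fileChannel.dataDirs = /var/lib/flume-ng/data

    # HDFS sink; the escape sequences require a usable timestamp
    collector.sinks.HadoopOut.type = hdfs
    collector.sinks.HadoopOut.channel = fileChannel
    collector.sinks.HadoopOut.hdfs.path = /flume/events/%Y-%m-%d
    collector.sinks.HadoopOut.hdfs.fileType = DataStream
    collector.sinks.HadoopOut.hdfs.useLocalTimeStamp = true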
0 votes, 4 answers

Big Data: Which HD Parameters are Important?

I work with a lot of datasets that are in the tens of GBs, usually split into several files. Performing any type of dataset-wide operation (grep, sed, search, read/write to/from databases and Hadoop) on these files is of course very slow and time…
Ryan Rosario
  • 225
  • 2
  • 9
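When deciding which drive parameters matter for workloads like these, it can help to measure the disks directly rather than reading spec sheets. A hedged sketch using standard tools (the device and path below are placeholders):

    # buffered sequential read speed of the underlying device
    sudo hdparm -t /dev/sda

    # sequential write throughput, forcing data to disk before dd exits
    dd if=/dev/zero of=/data/ddtest bs=1M count=1024 conv=fdatasync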
0 votes, 1 answer

Setting up a secure Hadoop cluster - Kerberos security

I set up an HDP 2.2 cluster successfully (1 NM, 3 DNs and 1 client). User accounts to access the HDP cluster were created on the client, and I checked that these users can submit jobs by SSHing to the client node and running sample jobs. In the next step I enabled Kerberos…
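Once Kerberos is enabled, every HDFS or YARN operation needs a valid ticket; a quick smoke test from the client node might look like this (the principal and realm are assumptions):

    # obtain a ticket for the submitting user
    kinit user1@EXAMPLE.COM

    # verify the ticket, then run a simple authenticated HDFS command
    klist
    hdfs dfs -ls /user/user1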
0 votes, 1 answer

Hadoop: How to configure failover time for a datanode

I need to re-replicate blocks on my HDFS cluster in case a datanode fails. Actually, this already appears to happen after a period of maybe 10 min. However, I want to decrease this time, but I am wondering how to do so. I tried to set…
frlan
  • 573
  • 1
  • 8
  • 27
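The roughly ten minutes observed here is no accident: in Hadoop 2.x the namenode declares a datanode dead after 2 × dfs.namenode.heartbeat.recheck-interval + 10 × dfs.heartbeat.interval, which with the defaults (300000 ms recheck, 3 s heartbeat) works out to 10 minutes 30 seconds. A sketch of shrinking that window in hdfs-site.xml (the values are examples only):

    <property>
      <!-- milliseconds; default 300000 -->
      <name>dfs.namenode.heartbeat.recheck-interval</name>
      <value>45000</value>
    </property>
    <property>
      <!-- seconds; default 3 -->
      <name>dfs.heartbeat.interval</name>
      <value>3</value>
    </property>
    <!-- resulting dead-node timeout: 2 x 45 s + 10 x 3 s = 120 s -->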
0 votes, 1 answer

Ambari/Nagios overwrites hadoop-services.cfg file on startup

When I shut down Nagios from the Ambari Web UI, modify the file hadoop-services.cfg, save it and open it, the new settings are there. However, when I start Nagios again (from the Ambari Web UI) and open hadoop-services.cfg, the changes are…
0 votes, 1 answer

Is Active Directory alone not enough to secure Hadoop?

I am trying to secure a Hadoop environment installed on Windows, so I started by analysing how to secure a Unix-based Hadoop cluster. I have gone through various links related to Kerberos and other Apache add-ons (Knox/Rhino/Sentry). Yet to…
Dinesh Kumar P
  • 163
  • 1
  • 6
0 votes, 1 answer

How to access Hadoop remotely?

I have installed Hadoop on an OpenStack CentOS guest VM. I'm able to open the site (from 192.168.0.10, VM-1): http://localhost:50070 http://192.168.0.10:50070 But I am not able to access the same from a remote machine (my…
Ibrar Ahmed
  • 101
  • 1
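If the web UI works locally but not remotely, a common culprit is the NameNode HTTP server listening only on a local interface, or a firewall in between. One hedged fix, assuming Hadoop 2.x, is to bind the UI to all interfaces in hdfs-site.xml and restart the namenode:

    <!-- hdfs-site.xml: listen on all interfaces for the NameNode web UI -->
    <property>
      <name>dfs.namenode.http-address</name>
      <value>0.0.0.0:50070</value>
    </property>

On an OpenStack guest, the instance's security group and any iptables rules on the VM must also allow inbound TCP on port 50070.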
0 votes, 1 answer

How to know which script or executable is linked with a metric in ganglia?

I have just started to explore Ganglia, and my question is: how do I know which script or executable is linked with a metric in Ganglia? The fact is that I don't know much about Ganglia. I have good experience with Zabbix, and I want to link a graph in…
Rohit
  • 101
  • 4
0 votes, 2 answers

Unable to set up a connection to Amazon EC2 and run Pig

I have made an EC2 key pair and saved it to a location under my home directory on my Mac. I have also changed permissions with 'chmod 600 /path/to/saved/keypair/file.pem'. Now I have followed the following instructions to run Pig on EC2: To set up and…
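With the key permissions already set as described, the connection step itself usually boils down to passing the key explicitly to ssh; the user and host below are placeholders (Amazon Linux AMIs typically use ec2-user, Ubuntu AMIs use ubuntu):

    # connect to the instance using the downloaded key pair
    ssh -i /path/to/saved/keypair/file.pem ec2-user@ec2-xx-xx-xx-xx.compute-1.amazonaws.com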
0 votes, 1 answer

Unable to connect to Amazon EC2 and run Pig

I have made an EC2 key pair and saved it to a location under my home directory on my Mac. I have also changed permissions with 'chmod 600 /path/to/saved/keypair/file.pem'. Now I have followed the following instructions to run Pig jobs on EC2: To set…
0 votes, 0 answers

Distributing the master node's SSH key

For the master node to SSH into the slaves without a password, the master needs to distribute its SSH key to the slaves. Copying the key using ssh-copy-id asks for the user password. If there are hundreds of nodes in the system, it may not be a good idea…
krackoder
  • 151
  • 1
  • 4
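One hedged sketch of a common approach, assuming a slaves.txt host list, a shared password in the SSH_PASS variable, and that the sshpass utility is acceptable in your environment (many shops use a configuration tool such as Ansible instead):

    # on the master, generate a key pair once, without a passphrase
    ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

    # push the public key to every slave; sshpass supplies the one password
    while read host; do
      sshpass -p "$SSH_PASS" ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@$host"
    done < slaves.txt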
0 votes, 2 answers

Error while running any Hadoop HDFS file system command

I am very new to Hadoop and am following the "Hadoop For Dummies" book. I have a VM with the following specs: Hadoop version 2.0.6-alpha (Bigtop), OS CentOS. The problem is that when I run any HDFS file system command I get the following error: hadoop hdfs dfs -ls…
Raj Kumar Rai
  • 3
  • 1
  • 1
  • 3
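One likely contributor is visible in the excerpt itself: hadoop hdfs dfs -ls mixes two launchers. The file system shell is invoked as one of the following:

    # the hdfs launcher ...
    hdfs dfs -ls /

    # ... or the generic hadoop launcher
    hadoop fs -ls /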
0 votes, 0 answers

Hadoop/HBase environment variables

I tried to set up a 4-node Hadoop cluster using CDH 4.7. The cluster is up and running fine, and when I submit a word count MR job it completes successfully, but when I submit an MR job to insert data into HBase it throws a class-not-found…
sunny
  • 1
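A class-not-found error from a MapReduce job that writes to HBase usually means the HBase jars are not on the job's classpath. A common remedy (the jar and class names below are hypothetical) is to prepend the output of hbase classpath at submission time:

    # put HBase's jars on the Hadoop classpath for this job
    export HADOOP_CLASSPATH=$(hbase classpath)
    hadoop jar my-hbase-job.jar com.example.InsertIntoHBase /input/path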