Questions tagged [hadoop]

Hadoop is an open-source solution providing a distributed, replicated file system and a production-grade map-reduce system, together with a series of complementary additions such as Hive, Pig, and HBase to get more out of a Hadoop-powered cluster.

Hadoop is an Apache Software Foundation project, with commercial support provided by multiple vendors, including Cloudera, Hortonworks, and MapR. The Apache project documents a more complete set of commercial solutions.

Available complementary additions to Hadoop include:

  • The Hadoop Distributed File System, HDFS (standard)
  • The map-reduce architecture (standard)
  • Hive, which provides a SQL-like interface to the map-reduce architecture
  • HBase, a distributed key-value store
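The map-reduce model listed above can be sketched in plain Python, with no Hadoop cluster required. This is an illustrative word-count sketch, not Hadoop's API: the map phase emits key-value pairs, a shuffle groups values by key (as Hadoop does between phases), and the reduce phase aggregates each group.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group all values by key, mirroring the sort/shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate the grouped values (here, sum the counts)."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the quick brown fox", "the lazy dog"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"])  # 2
```

In a real Hadoop job the map and reduce functions run in parallel across the cluster, and HDFS supplies the input splits; the data flow, however, is the same.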

Recommended reference sources:

261 questions
1 vote, 1 answer

Ambari server exits with no error message in the log

I have downloaded Hortonworks Data Platform 2.3.0 and installed it on CentOS 7. The installation was successful. When starting the server, the following messages are displayed: [root@sparkperf-5360 apps]# ambari-server start Using python …
1 vote, 1 answer

Hyper-V VLAN with static IP

Trying to set up/simulate a Hadoop cluster locally via multiple (Hyper-V) VMs. I'm using the Hortonworks sandbox image for Hyper-V, which runs CentOS Linux. I can hit the VM if I use an internal switch and then set up a connection from the adapter to this VM, and…
Alwyn
1 vote, 1 answer

Hadoop namenode port getting blocked

I have installed 7 VM instances of Ubuntu 14.04 LTS servers. The first instance runs the namenode service and the other 6 nodes run the datanode service. I think my NameNode is crashing or getting blocked due to some issue. After rebooting, if I check JPS…
1 vote, 0 answers

How to force HDFS to use LDAP user's UID

I have a Cloudera cluster with HDFS and Hue services and I'm trying to unify authentication using LDAP. I have my LDAP server running thanks to 389-ds (not sure if it is the best way) and I can log into Hue with users from the LDAP server. When I…
Carlos Vega
1 vote, 1 answer

Slaves get a connection timed out with HDFS

I have 3 node instances: master, slave1, and slave2. SSHing between these nodes works fine. Here are the processes that start on each node when I say…
Prasanna
1 vote, 1 answer

Rhadoop hdfs.init() Error

I recently installed CDH 5.1.0 along with R 3.1.*, and I got rmr2, rJava, and rhdfs all installed properly (along with the required packages, and set the required environment variables). After some trouble with installing rhdfs I added this to my…
user306603
1 vote, 0 answers

When and how are initial directories created in HDFS

I have a Hadoop setup in which the configured HDFS umask is 027 instead of the default one. Some of the initially created directories have correct permissions (like tmp drwxrwxrwx) but others such as /home are not usable (drwxr-x---). As I'm…
sortega
1 vote, 2 answers

"/usr/bin/env: bash: No such file or directory" during puppet exec command

I am taking my first steps with Puppet. I am trying to set up Ambari. This is my Puppet config: exec { "ambari-repo": command => "curl http://public-repo-1.hortonworks.com/ambari/suse11/1.x/updates/1.4.4.23/ambari.repo >…
cremersstijn
1 vote, 1 answer

Additional Storage Options for Hadoop HDFS Nodes

We have a small production Cloudera-distribution Hadoop cluster (14 nodes, but growing). As we have expanded our usage of this cluster, we have found that disk storage is our biggest blocker and requirement. RAM and CPU usage are minimal with our…
Geek42
1 vote, 1 answer

Changing ulimit on Ubuntu 12.04 never works

I am working with Hadoop and need to change the number of open files with ulimit -n. I have seen similar questions on Stack Overflow and elsewhere and have tried everything in those answers, but it still does not work. I am working with Ubuntu 12.04 LTS. Here…
Ravi Bhatt
1 vote, 1 answer

gitosis interferes with hadoop

Never thought I'd write a title like that, but it's true: I have gitosis and Hadoop installed. > sudo /usr/lib/hadoop/bin/start-all.sh Enter passphrase for key '/root/.ssh/id_rsa': root@localhost's password: localhost:…
Mamut
1 vote, 1 answer

Reverse and Forward DNS set up correctly but sometimes MapReduce job fails

Ever since we switched over our cluster to communicate via private interfaces and created a DNS server with correct forward and reverse lookup zones, we get this message before the M/R job runs: ERROR…
phodamentals
1 vote, 0 answers

Use Amazon SNS to send nagios alerts

Is there any way by which I can send Nagios alerts to Amazon SNS? I have tried the following steps, but it's giving me this error in the Nagios log file: Jul 12 11:38:23 ip-10-134-13-204 nagios3: Warning: Attempting to execute the command "export…
1 vote, 2 answers

Hadoop/Hive metastore

Where do people place their multi-user metastore? I'm going to use MySQL, but I don't know where I should put it: on the name node or on its own server?
razor
1 vote, 1 answer

Unable to read S3 file from interactive pig job flow

I'm unable to read a simple test file on S3 from an interactive Pig job flow (Hadoop, Elastic MapReduce), and I'm not sure why. I have two S3 buckets. Let's call them unmounted_bucket and mounted_bucket. Both of these buckets were initially…