Questions tagged [hadoop]

Hadoop is an open-source solution that provides a distributed, replicated file system and a production-grade MapReduce system, along with a series of complementary additions such as Hive, Pig, and HBase that help get more out of a Hadoop-powered cluster.

Hadoop is an Apache Software Foundation project, with commercial support provided by multiple vendors, including Cloudera, Hortonworks, and MapR. Apache documents a more complete list of commercial offerings.

Standard components and complementary additions include:

  • Hadoop Distributed File System, HDFS (standard)
  • The MapReduce architecture (standard)
  • Hive, which provides a SQL-like interface to the MapReduce architecture
  • HBase, a distributed key-value store
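
To make the MapReduce item above more concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps, using word counting as the example. It does not use Hadoop's API at all; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration, and a real job would run these steps distributed across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, mimicking what happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "hadoop processes data"]))
```

The point of the split is that each map call and each reduce call depends only on its own input, which is what allows a framework to spread them over many machines.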

Recommended reference sources:

261 questions
0 votes, 1 answer

Best practices - Conditionals for Chef resources without guard attributes?

We are setting up a cluster using Apache Ambari. Our Chef run is interrupted by the need to use Ambari to provision the Hadoop cluster. Current installs are a three-part process: (1) initial Chef run to prep the OS; (2) use Ambari to configure (and later…
invict_us
0 votes, 1 answer

HDFS' ZKFC service unable to start

CDH4's ZooKeeper Failover Controller (ZKFC) has been installed. Starting the ZKFC service: [vagrant@localhost ~]$ sudo service hadoop-hdfs-zkfc start Starting Hadoop zkfc: [ OK ] starting…
030
0 votes, 0 answers

Can I change my Hive external table storage format without affecting its partitions?

I want to change my external table's storage format from text file to ORC file. I have already added around 500 partitions to the external table stored as 'TEXTFILE'. If I simply create another external table with the 'ORC file' format, then I have to create…
dheee
0 votes, 0 answers

How to start Hadoop on CentOS

I have installed Hadoop using yum on CentOS: yum install hadoop. Now I want to start Hadoop, but I cannot find the start-all.sh script mentioned on many blogs and wiki pages, such as http://wiki.apache.org/hadoop/GettingStartedWithHadoop. It looks like…
cremersstijn
0 votes, 1 answer

Cannot print HDFS topology when using viewfs:/// as fs.defaultFS

I am using rack awareness in HDFS, and I can use the following command to get the topology: hdfs dfsadmin -printTopology. Today, after setting up HDFS Federation and using viewfs instead of hdfs as defaultFS like this: