Questions tagged [cloudera-quickstart-vm]

Cloudera QuickStart VM contains a single-node Apache Hadoop cluster including Cloudera Manager, example data, queries, and scripts. It is free software developed by Cloudera.

195 questions
2
votes
1 answer

How to Execute MapReduce Job/JAR with Cloudera Quickstart Docker container

I need some help running a MapReduce program/job in the Cloudera Docker container. I am using a high-spec Linux (elementary OS) laptop (24 GB RAM, i7 processor). I installed the Cloudera Docker image, ran it, and also did the…
2
votes
2 answers

Run spark python job with Oozie and Hue - Intercepting System.exit(1)

I have to run some Spark Python scripts as Oozie workflows. I've tested the scripts locally with Spark, but when I submit them to Oozie I can't figure out why they are not working. I'm using the Cloudera VM, and I'm managing Oozie with the Hue dashboard.…
2
votes
1 answer

Create a Hadoop cluster using cloudera quickstartVM errors

I want to create a Cloudera cluster using the QuickStart VM image, which you can download directly from Cloudera's web page (http://www.cloudera.com/downloads/quickstart_vms/5-8.html). I have three virtual machines, and I would like to have one master…
2
votes
2 answers

Setting up AWS Credentials - Cloudera Quickstart Docker Container

I am trying to use Cloudera's QuickStart Docker container to test simple Hadoop/Hive jobs. I want to be able to run jobs on data in S3, but so far I am having problems. I have added the below properties to core-site.xml, hive-site.xml,…
DJElbow
  • 3,345
  • 11
  • 41
  • 52
2
votes
0 answers

Unable to establish connection between Impala and Rstudio using rimpala.connect()

I am unable to establish a connection between Impala and RStudio. I am using the Cloudera QuickStart VM for Cloudera Manager and RStudio. Please see the code below and advise if anything could be…
Enno Victor
  • 41
  • 1
  • 2
  • 7
2
votes
1 answer

Compress Json data in hive external table, at the time querying throwing exception?

I have created external tables by following the steps below: hive> ADD JAR /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar; hive> set hive.exec.compress.output=true; hive> set mapred.output.compress=true; hive> set…
Sai
  • 1,075
  • 5
  • 31
  • 58
2
votes
1 answer

How to check if the cloudera services like hive, Impala are running or not through java code?

I want to run some Hive queries and then collect different metrics, like HDFS bytes read/written. For this I have written Java code. But before running the code, I just want to check if the Cloudera services like Hive, Impala, and YARN are running…
2
votes
2 answers

Hue configuration error -/etc/hue/conf.empty - Potential misconfiguration detected

Hi Experts, I'm a newbie to Hadoop, the Linux environment, and Cloudera. I installed Cloudera VM 5.7 on my machine and imported MySQL data to HDFS using Sqoop. I'm trying to execute some queries against this data using Impala. So, I tried launching…
samy
  • 65
  • 1
  • 8
2
votes
1 answer

best practice to load multiple client data into Hadoop

We are creating a POC on the Hadoop framework with Cloudera CDH. We want to load data from multiple clients into Hive tables. As of now, we have a separate database for each client on SQL Server. This infrastructure will remain the same for OLTP. Hadoop will be…
107
  • 552
  • 3
  • 26
2
votes
1 answer

Debug Apache Slider package?

I went through the Slider Memcached Tutorial and was able to package/deploy/start the memcached container successfully; however, when I package up a custom application, basically a Java jar plus dependencies, the container never launches…
dr3x
  • 917
  • 2
  • 14
  • 26
2
votes
2 answers

Passwords for Cloudera Quickstart VM users

I wonder where I can see the passwords for different user accounts in the Cloudera QuickStart VM, like the yarn and hdfs users, etc. I am using version 5.4.0.
oikonomiyaki
  • 7,691
  • 15
  • 62
  • 101
2
votes
1 answer

"KeyError: 'SPARK_HOME' ", "can't load main class from JAR" in running PySpark as an Oozie workflow job

This issue is a continuation of my previous question here, which was seemingly resolved but led to another issue. I am using Spark 1.4.0 on the Cloudera QuickStart VM, CDH 5.4.0. When I run my PySpark script as a SparkAction in Oozie, I…
oikonomiyaki
  • 7,691
  • 15
  • 62
  • 101
2
votes
1 answer

Unknown host exception when using spring data hadoop to connect to Cloudera QuickStart VM Hbase

I use the QuickStart VM for CDH 5.3.x. I am trying to implement this Spring Hadoop sample for HBase. The sample, run from the host computer, connects to HBase in the VM to create a table, add data, and read data. In my pom I use
SieuCau
  • 195
  • 1
  • 2
  • 15
1
vote
1 answer

Installing Cloudera Quick start VM on M1 macOs

Currently I am learning Hadoop. Previously I used a lab where I could access the Hadoop ecosystem. Recently I got an M1 Mac, and I want to run the same setup through the Cloudera QuickStart VM. I know that it can run on Intel-based macOS, so is it possible to run…
1
vote
1 answer

Where is the hive-site.xml in Cloudera distribution?

I would like to know where the hive-site.xml file configuration is in a Cloudera distribution. Mainly because I would like to know where I can find out properties…