Questions tagged [cloudera-quickstart-vm]

Cloudera QuickStart VM contains a single-node Apache Hadoop cluster including Cloudera Manager, example data, queries, and scripts. It is free software developed by Cloudera.

195 questions
3
votes
2 answers

Why is hadoop mapReduce with python failing but the scripts are working on command line?

I'm trying to implement a simple Hadoop MapReduce example using Cloudera 5.5.0. The map and reduce steps should be implemented using Python 2.6.6. Problem: if the scripts are executed on the Unix command line, they work perfectly fine and…
Marco P.
  • 81
  • 5
3
votes
1 answer

Quickstart VM Cloudera parcel won't start

I have a problem understanding something with the Cloudera Quickstart VM. Let me try to explain by outlining my steps so far. I want to write something using Kafka to connect to a web service and ingest a data feed. I'm going to use the Cloudera…
3
votes
2 answers

How to check status of Spark (Standalone) services on cloudera-quickstart-vm?

I am trying to get the status of the services, namely spark-master and spark-slaves, of the Spark (Standalone) service running on my local VM. However, running sudo service spark-master status is not working. Can anybody provide some hints on how to…
somnathchakrabarti
  • 3,026
  • 10
  • 69
  • 92
3
votes
4 answers

Cloudera Hue Web UI Default password

I have downloaded Cloudera CDH 5.3 recently and now I need to access the Hue web UI portal. When I enter the default Cloudera username and password, admin/admin, it does not work. I am unable to log in to the Hue portal now. Can…
3
votes
2 answers

Why does dropna() not work?

System: Spark 1.3.0 (Anaconda Python dist.) on Cloudera Quickstart VM 5.4. Here's a Spark DataFrame: from pyspark.sql import SQLContext from pyspark.sql.types import * sqlContext = SQLContext(sc) data = sc.parallelize([('Foo',41,'US',3), …
Jason
  • 2,834
  • 6
  • 31
  • 35
2
votes
2 answers

Not able to download Cloudera

I am trying to find a link to download the Cloudera zip file for VMware, but I am unable to find one. I tried searching on Google and on the Cloudera website, but in vain. Can somebody share some views on it?
2
votes
0 answers

Sqoop Fail In Hue Workflow

When the following sqoop import is run in the command shell, it works well. import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" --username retail_dba --password cloudera -m 1 --table categories --hive-database retail_stage --hive-table…
Yunus Einsteinium
  • 1,102
  • 4
  • 21
  • 55
2
votes
0 answers

Install spark on cloudera vm

I tried to install ipython 1.2.1 using this command: sudo easy_install ipython==1.2.1 but it failed with: No local packages or download links found for ipython==1.2.1 error: Could not find suitable distribution for …
Dr. know
  • 107
  • 2
  • 4
2
votes
1 answer

Zookeeper running or not in relation to standard port 2181 usage?

Cloudera QuickStart 5.13, as follows. I am not sure whether ZooKeeper is running out of the box and, if so, whether it would work reliably. I got this when trying to run ZooKeeper from within the Kafka-supplied version that I downloaded,…
thebluephantom
  • 16,458
  • 8
  • 40
  • 83
2
votes
1 answer

Failed to connect to server: quickstart.cloudera/10.0.2.15:8032

[cloudera@quickstart ~]$ sqoop import -connect jdbc:mysql://localhost/test -username root -P -table transactions -m 1 When executing the above command, I get the following exception. Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo…
Jagadeesh
  • 239
  • 5
  • 9
2
votes
1 answer

Loading data into Hive Table from HDFS in Cloudera VM

When using the Cloudera VM how can you access information in the HDFS? I know there isn't a direct path to the HDFS but I also don't see how to dynamically access it. After creating a Hive Table through the Hive CLI I attempted to load some data…
Angel Lockhart
  • 157
  • 1
  • 2
  • 11
2
votes
2 answers

Spark-submit create only 1 executor when pyspark interactive shell create 4 (both using yarn-client)

I'm using the Cloudera QuickStart VM (CDH 5.10.1) with PySpark (1.6.0) and YARN (MR2 Included) to aggregate numerical data per hour. I've got 1 CPU with 4 cores and 32 GB of RAM. I've got a file named aggregate.py but until today I never submitted…
2
votes
1 answer

how to integrate cloudera apache sentry with open ldap

I have LDAP in my CDH 5.10 quick start VM for development and I have started the Sentry service within that. Now I want to integrate Apache Sentry with LDAP. Please let me know if that is even possible and if yes please guide me through the…
sachingupta
  • 709
  • 2
  • 9
  • 30
2
votes
0 answers

Cloudera VM Insufficient space for shared memory file

I am getting the error below while starting Hive in the Cloudera VM (CDH 5.10): Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared memory file: /tmp/hsperfdata_cloudera/26270 Try using the -Djava.io.tmpdir= option to select an…
Kannan Kandasamy
  • 13,405
  • 3
  • 25
  • 38
2
votes
1 answer

Oozie simple ssh job failing : AUTH_FAILED: Not able to perform operation

I am trying to run a simple SSH job using Cloudera Oozie.…
Atish
  • 4,277
  • 2
  • 24
  • 32