Questions tagged [cloudera]

Cloudera Inc. is a Palo Alto-based enterprise software company which provides Apache Hadoop-based software and services.

Cloudera, the commercial Hadoop company, develops and distributes Hadoop, the open source software that powers the data processing engines of the world’s largest and most popular websites.

Cloudera's Distribution including Apache Hadoop (CDH) is a free package built from the powerful, flexible, scalable Apache Hadoop software. To help you learn about Hadoop and how to use it, Cloudera offers public and private training, certification and online courseware.

2533 questions
0
votes
0 answers

Identify islands with hierarchy and start and end dates in Cloudera 6.2x Impala

I have a tricky gaps and islands problem. Islands of dates must be identified across a hierarchy of fields. This gaps and islands problem differs from the traditional type in a few ways: The dates are in a range style format in two fields…
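
For readers new to the pattern, here is a minimal sketch of the general gaps-and-islands idea over start/end date ranges, grouped by a hierarchy key. It is written in plain Python rather than Impala SQL, and it is not the asker's solution: the function name `find_islands`, the tuple layout `(key, start, end)`, and the rule that ranges touching within one day belong to the same island are all assumptions made for illustration, since the excerpt is truncated.

```python
from collections import defaultdict
from datetime import date, timedelta


def find_islands(rows):
    """Merge date ranges into islands per hierarchy key.

    rows: iterable of (key, start, end), where key is a tuple of the
    hierarchy fields and start/end are inclusive dates (assumed layout).
    Returns a dict mapping each key to a sorted list of (start, end) islands.
    """
    by_key = defaultdict(list)
    for key, start, end in rows:
        by_key[key].append((start, end))

    islands = {}
    for key, ranges in by_key.items():
        ranges.sort()  # order by start date, then end date
        merged = [ranges[0]]
        for start, end in ranges[1:]:
            last_start, last_end = merged[-1]
            # Assumption: overlapping or directly adjacent ranges
            # (next start is at most one day after the last end) form one island.
            if start <= last_end + timedelta(days=1):
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        islands[key] = merged
    return islands


example_rows = [
    (("A", "x"), date(2020, 1, 1), date(2020, 1, 10)),
    (("A", "x"), date(2020, 1, 11), date(2020, 1, 20)),
    (("A", "x"), date(2020, 2, 1), date(2020, 2, 5)),
    (("B", "y"), date(2020, 1, 5), date(2020, 1, 6)),
]
for key, spans in find_islands(example_rows).items():
    print(key, spans)
```

In SQL the same idea is usually expressed with window functions (a running maximum of the end date per partition, then a flag where a new island starts), but the exact query depends on details the excerpt does not show.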
0
votes
3 answers

How to count NaN items in Impala query?

I have a table that has 'NaN' in a field that is a double. I simply want to count how many items are 'NaN': Select count(*) from table where col = 'NaN' AnalysisException: operands of type DOUBLE and STRING are not comparable: col = 'NaN' Select…
user3486773
  • 1,174
  • 3
  • 25
  • 50
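
A short aside on why the comparison in this excerpt is awkward: the quoted error comes from comparing a DOUBLE column with a string, and even with a numeric NaN on both sides, IEEE 754 defines NaN as unequal to everything, including itself. The Python sketch below only illustrates that behavior; the helper name `count_nan` and the sample data are hypothetical. Impala's documentation lists an `is_nan()` math function that serves the same purpose inside a query, though it is worth checking against the version in use.

```python
import math


def count_nan(values):
    """Count entries that are floating-point NaN, skipping non-floats such as None."""
    return sum(1 for v in values if isinstance(v, float) and math.isnan(v))


nan = float("nan")
sample = [1.0, nan, 2.5, float("nan"), None]

# Equality never matches NaN, so an equality filter cannot find these rows.
print(nan == nan)
# An explicit NaN test is needed instead.
print(count_nan(sample))
```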
0
votes
1 answer

Kerberos error while connecting to Cloudera Impala environment

While connecting to a kerberized Hadoop environment I get the error: [Simba]ImpalaJDBCDriver Unable to connect to server: [Simba]ImpalaJDBCDriver Kerberos Authentication failed. I've installed the Cloudera QuickStart VM in VirtualBox, enabled Kerberos, writing…
0
votes
1 answer

gcloud installation issues in Cloudera VM

I am trying to configure the gcloud SDK in a Cloudera VM. Below are the commands I have used. I have tried to pass python a default parameter in install.sh, but it is still not working. Can someone suggest a clean approach? curl https://sdk.cloud.google.com |…
user3858193
  • 1,320
  • 5
  • 18
  • 50
0
votes
1 answer

Unable to insert 5k records/sec into Impala?

I am exploring Impala for a POC, but I can't see any significant performance. I can't insert 5,000 records/sec; at most I was able to insert a mere 200/sec. This is really slow compared with typical database performance. I tried two different methods but…
gadhvi
  • 97
  • 2
  • 11
0
votes
0 answers

How to add 1 billion rows in Impala?

I am exploring Impala for a POC; my task is to add 1 billion rows and check how fast we can insert and retrieve data. I have created a table which has 30 columns: half of them are strings, 2 are timestamps and the rest are integers. It's taking close to 3 hours…
gadhvi
  • 97
  • 2
  • 11
0
votes
2 answers

How to install CM over an existing non-CDH cluster

Is it possible to install CM over an existing non-CDH cluster? For example, I have manually installed Hadoop and other services on my VMs. Can I install CM and force it to manage my cluster?
Markiza
  • 444
  • 1
  • 5
  • 18
0
votes
1 answer

Iterative functions for Apache Impala

I am working on my graduation project, which uses Impala, so I want to ask: is there any way to use constructs like 'for', 'if', 'while', etc. in Cloudera Impala?
0
votes
1 answer

Cloudera Sqoop exception while creating a job with the sqoop command

I have a VM, cloudera-quickstart-vm-5.13.0-0-virtualbox, currently running. I execute the following command: sudo sqoop --options-file /home/cloudera/sqoop-job/sqoop-job3.txt Test #1: The content of file #1 (sqoop-job3.txt) is the…
HailToTheVictor
  • 388
  • 2
  • 8
  • 28
0
votes
2 answers

Which Cloudera version supports YARN node labels

I would like to know whether Cloudera supports YARN node labels. If yes, which version of Cloudera can be used?
Prashant
  • 702
  • 6
  • 21
0
votes
1 answer

How to distribute JDBC jar on Cloudera cluster?

I've just installed a new Spark 2.4 from CSD on my CDH cluster (28 nodes) and am trying to install a JDBC driver in order to read data from a database from within a Jupyter notebook. I downloaded and copied it on one node to the /jars folder, however it…
michalrudko
  • 1,432
  • 2
  • 16
  • 30
0
votes
0 answers

How to sum up values in Solr 4.10.3

I installed Solr from CDH; the version is 4.10.3-cdh5.13.0. My job works like the following SQL query. How can I do the same in Solr 4? select D,E,sum(A),sum(B) from DOC where C='balabala' group by D , E; I tried this one, but only one column in group…
Fire
  • 93
  • 9
0
votes
1 answer

What is the cm-api to remove parcels from a host and delete the parcels permanently

Cloudera APIs provide a combination of options to activate and deactivate parcels. What are the commands to remove parcels from a host and remove the parcels permanently from CM…
rash1411
  • 55
  • 1
  • 1
  • 6
0
votes
0 answers

Sentry always goes down at 9 p.m.

After capacity expansion, Sentry (sentry-1.5.1+cdh5.15.2+470) goes down every day at 9 p.m., as follows: Caused by: javax.jdo.JDODataStoreException: Iteration request failed : SELECT 'org.apache.sentry.provider.db.service.model.MPath' AS…
ZH-hua
  • 9
  • 2
0
votes
2 answers

Kafka console consumer not receiving messages from console producer

I recently set up a Cloudera QuickStart VM using the Docker image and set up the Kafka parcel in it. After successful installation, I see that all the services are running in green status (including Kafka and ZooKeeper). However, when I follow the below…
Prashant
  • 702
  • 6
  • 21