Questions tagged [qubole]

Qubole Data Service (QDS) is a cloud Big Data service running on an elastic Hadoop-based cluster.

The creators of Facebook’s Big Data infrastructure and Apache Hive have leveraged their experience to deliver Qubole Data Service (QDS), a cloud Big Data service offering the same advanced capabilities used by Big Data savvy organizations.

Minimize operational interaction and provide your data analysts with an easy-to-use graphical interface, built-in connectors, and seamless, elastic cloud infrastructure.

Your Hadoop cluster is ready within minutes of signup, letting you focus on building sophisticated data pipelines, running queries, scheduling jobs, and monetizing your big data.

With an auto-scaling cluster, improved I/O optimization, faster queries, and support for hybrid pricing, you can realize total cost savings of as much as 50%-60% while accomplishing tasks faster.

87 questions

1 answer

Import csv file into Qubole

I am using Qubole to run Presto queries. I need to upload a CSV file into my query but cannot figure out how to do this. Does anyone have any experience with this? For more details, I am under the Analyze section. This is what I have so far…

asked Aug 27 '18 at 15:08

nak5120

0 answers

IN and NOT IN HiveQL

I am new to HiveQL. Are IN and NOT IN supported in it, especially when using Qubole? Here is my query: SELECT DISTINCT vId FROM table1 WHERE d.columnOne = "123" AND NOT d.columnTwo AND timestamp between 1523550000000 AND 1523930000000 AND NOT…

hive hiveql qubole

asked May 08 '18 at 20:49

noobeerp

1 answer

UDF to generate JSON string behaving inconsistently

I'm trying to generate a JSON string to store a variable number of history records in a single STRING column. The code works on all of my small tests, but fails (no error, just no data) when run on the actual data. Here's what I have: class…

scala apache-spark json4s qubole

asked Apr 23 '18 at 19:34

FrankGT

1 answer

Run Tensorflow in Qubole

I am trying to train an LSTM using a Spark Python notebook in Qubole. When I try to fit the model, I receive the error below: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to…

apache-spark tensorflow pyspark qubole

asked Jan 11 '18 at 12:00

GihanDB

1 answer

How to select records from week days?

I have a Hive table which contains daily records. I want to select records from weekdays, so I use the Hive query below to do it. I'm using the Qubole API to do this. SELECT hour(pickup_time), COUNT(passengerid) FROM home_pickup WHERE …

hive qubole

asked Aug 21 '17 at 05:07

GihanDB

1 answer

AWS S3 access issue when using qubole/streamx on AWS EMR

I am using qubole/streamx as a Kafka sink connector to consume data from Kafka and store it in AWS S3. I created a user in IAM with the AmazonS3FullAccess permission, then set the key ID and key in hdfs-site.xml, whose directory is assigned in…

amazon-s3 apache-kafka apache-kafka-connect qubole

asked Feb 14 '17 at 10:49

Chris Feng

1 answer

pyspark job on qubole fails with "Retrying exception reading mapper output"

I have a pyspark job running via qubole which fails with the following error. Qubole > Shell Command failed, exit code unknown Qubole > 2016-12-03 17:36:53,097 ERROR shellcli.py:231 - run - Retrying exception reading mapper output: (22, 'The…

pyspark qubole

asked Dec 03 '16 at 17:50

Lekha Muraleedharan

1 answer

How do I optimize my hive query for finding Sum of Count of Records from multiple tables

I have to generate a report that will give me the sum of the counts from tables A, B and C for events that have been stored using Hive, and my S3 buckets have been partitioned by Organization_id. For example: Table A – Has a record for every day John (and…

hadoop amazon-s3 hiveql qubole

asked Mar 30 '16 at 15:48

Ajay

1 answer

Unable to create table in Qubole similar to mysql

I want to create an external table in Qubole similar to a table created in MySQL. The query to create the table in MySQL is: CREATE TABLE `mytable` ( `id` varchar(50) NOT NULL, `v_count` int(11) DEFAULT NULL, `l_visited` timestamp NOT NULL DEFAULT…

mysql hive qubole

asked Dec 09 '15 at 05:26

Rahul Kumar

2 answers

Autoscaling EMR- is it required? Should I just use EC2? Should I just use Qubole?

In order to reduce the time for provisioning, we've decided to keep up a dedicated EMR cluster with 5 instances (we expect to need about 5). In case we need more, we think we'll need to implement some sort of autoscaling. I'm not familiar at all…

hadoop amazon-web-services emr autoscaling qubole

asked Nov 05 '14 at 00:13

user1136342

1 answer

Pyspark error- Invalid argument, not a string or column

I have a dataframe in PySpark, df_all. It has some data and I need to do the following: count = ceil(df_all.count()/1000000). It gives the following error: TypeError: Invalid argument, not a string or column: 0.914914 of type . For…

pyspark qubole

asked Aug 16 '23 at 18:22

user2280352

0 answers

How to view log file in qubole

I would like to retrieve the Qubole usage report, but I don't know where the data is stored. I don't want to download the log file every time; my aim is to build a table out of it: a table of logs from each query/scheduler in Qubole.

scheduler qubole

asked Jul 17 '23 at 07:25

Subhi

0 answers

Qubole Data in hive table returning all the values as null after creating the schema from Amazon S3

I created the Hive table using Explore, going under My Amazon S3. After creating the schema from it, I am able to create the external tables and store them in the Qubole Hive explorer under the default. As I move further to query the data in…

apache-spark hive qubole

asked Feb 21 '23 at 00:57

Robin Bhullar

0 answers

Extracting json field from float in Hive using dataset

Quick one guys. I am facing an issue while querying a float JSON Column, as it returns the following error: "Error while compiling statement: FAILED: SemanticException [Error 10014]: line 11:5 Wrong arguments ''$.percentage'': No matching method for…

json hive hiveql qubole

asked Sep 26 '22 at 17:27

Diego

1 answer

Presto Pivoting Data

I am really new to Presto and having trouble pivoting data in it. The method I am using is the following: select distinct location_id, case when role_group = 'IT' then employee_number end as IT_emp_num, case when role_group = 'SC' then…

sql pivot presto qubole

asked Mar 12 '22 at 15:45

llorcs
