Questions tagged [qubole]

Qubole Data Service (QDS) is a cloud Big Data service running on an elastic Hadoop-based cluster.

Source: Creators of Facebook’s Big Data infrastructure and Apache Hive have leveraged their experience to deliver Qubole Data Service (QDS) – a cloud Big Data service offering the same advanced capabilities used by Big Data savvy organizations.

Minimize operational interaction and provide your data analysts with an easy-to-use graphical interface, built-in connectors, and seamless, elastic cloud infrastructure.

Your Hadoop cluster is ready within minutes of signup, letting you focus on building sophisticated data pipelines, running queries, scheduling jobs, and monetizing your big data.

With an auto-scaling cluster, improved I/O optimization, faster queries, and support for hybrid pricing, realize total cost savings of as much as 50%-60% while accomplishing tasks faster.

87 questions

0 answers

Insert overwrite doesn't delete all the old data files

We are trying to insert overwrite a Hive table. Most of the time it overwrites as expected, i.e. deleting any old files and replacing them with new files. We are seeing some inconsistencies with this behavior: once in a while all the old files are not…

asked May 18 '21 at 04:57

Jas

1 answer

Retrieve value in an array of an array with struct

I have a column in a Hive table with type: array<array<struct<…>>>. Here is a sample of the data in the column: [ [ { "type": "PROFIT", "value": "100", "currency": "USD" }, { …

sql arrays hive hiveql qubole

asked May 06 '21 at 04:15

user1761325

0 answers

Query Qubole data in Python

I'm trying to query Qubole data in Python, but running into some issues. Below is my code: from qds_sdk.qubole import Qubole Qubole.configure(api_token="api_token", api_url="https://us.qubole.com/api") from qds_sdk.commands import…

python qubole

asked Apr 30 '21 at 20:53

BirdPlay6

1 answer

Exclude records with certain values in Qubole

Using Qubole, I have Table A (columns in json parsed...):

ID  Recommendation  Decision
1   GOOD            GOOD
2   BAD             BAD
2   GOOD            BAD
3   GOOD            BAD
4   BAD             GOOD
4   GOOD            BAD

I…

sql hadoop hive hiveql qubole

asked Nov 24 '20 at 08:00

Kurlito

2 answers

How to connect UiPath to Qubole Hive cluster and run a query

One of the teams using RPA in my company wants to automate reporting that is run in Qubole - Hive environment. The initial approach is to unleash the robot to log in to Okta, then Workbench in Qubole, run the query, and download results. Is there a…

hive rpa uipath qubole

asked Sep 21 '20 at 23:55

Krystian Duda

2 answers

Result-set inconsistency between hive and hive-llap

We are using Hive 3.1.x clusters on HDI 4.0, one being LLAP and the other just Hive. We've created managed tables on both clusters, with the row count being 272409. Before merge on both…

hive azure-hdinsight qubole spark-hive

asked Jul 30 '20 at 17:51

Vinay K L

1 answer

Spark Submit Default Command line options

How can we change the parameters in the Spark Submit default command line options in Qubole? There is an option to override the values if needed under "Spark Submit Command Line Options", but this option is not available in Spark "Command Line".

apache-spark command default qubole

asked Apr 02 '20 at 06:51

Throw

1 answer

How to create hive external table with avro file on qubole?

Can someone point me to the docs for creating an external table on Qubole based on Avro files? CREATE TABLE my_table_name ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT …

hive avro qubole

asked Oct 25 '19 at 01:25

user10714010

2 answers

Implement case class inside a class

I am using the below code to run in Qubole Notebook and the code is running successfully. case class cls_Sch(Id:String, Name:String) class myClass { implicit val sparkSession =…

scala apache-spark apache-spark-sql apache-spark-dataset qubole

asked Jul 10 '19 at 14:17

Sarath Subramanian

1 answer

Extracting json field from string in Hive using dataset

I am trying a very basic Hive query. I am trying to extract a JSON field from a dataset, but I always get \N for the JSON field; however, some_string comes through okay. Here is my query: WITH dataset AS ( SELECT CAST( '{ "traceId": "abc",…

json hive hiveql qubole

asked May 30 '19 at 18:25

Bhavya Arora

1 answer

Get Qubole data row wise using java

I am trying to run a Hive query using the Qubole SDK. Though I am able to get the desired result as a string, in order to process it better, I am looking to access it row-wise, something like a list of Java objects. The way I am getting the data…

java hive qubole

asked Apr 18 '19 at 10:09

roger_that

1 answer

Recommendation on Performance optimization for SQL code

I have code in Qubole that takes almost 3 hours to execute. I am looking for recommendations to decrease the execution time. WITH -- Get latest date - 10 days before as day d AS ( SELECT CAST(CONCAT ( …

sql performance query-optimization qubole

asked Apr 11 '19 at 17:09

Flash

1 answer

Syncing Qubole Hive table to Snowflake with Struct field

I have a table like the following in Qubole: use dm; CREATE EXTERNAL TABLE IF NOT EXISTS fact ( id string, fact_attr struct< attr1 : String, attr2 : String > ) STORED AS PARQUET LOCATION 's3://my-bucket/DM/fact' I have created…

hive pyspark snowflake-cloud-data-platform qubole

asked Oct 23 '18 at 08:15

Ambrish

2 answers

Different results when distinct count by different time periods

I am trying to get a count of unique visitors. I first checked the total without separating it by any time frame. Main table (big data table sample): +-----------+----+-------+ |theDateTime|vD | vis | +----------------+-------+ |2018-10-03 |123…

sql apache-spark apache-spark-sql bigdata qubole

asked Oct 13 '18 at 18:15

noobeerp

1 answer

Big files causing shuffle error in hadoop map reduce

I am seeing the following error when I try to process big files (size > 35GB), but it doesn't happen when I try smaller files (size < 10GB). App > Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in…

java hadoop mapreduce qubole

asked Oct 08 '18 at 18:18

Jal
