Questions tagged [shark-sql]

Shark has been subsumed by Spark SQL (tag: apache-spark-sql). It was an open source distributed SQL query engine for Hadoop data that brought state-of-the-art performance and advanced analytics to Hive users.

59 questions

votes

2 answers

Are there any Python or Scala tools to connect to Spark/Shark?

I want to use Python or Scala to connect to a Shark server, but I didn't find any tools to do this. Are there any libraries (Python or Scala/Java)? Thanks in advance.

asked Oct 12 '13 at 08:10

Joey.Chang

vote

1 answer

Return Boolean (1 or 0) if table contains duplicate rows

I wish to return a boolean value if there are duplicates in the table in Hive 0.9. For now, I'm doing this: select cast(case when count(*) > 0 then 1 else 0 end as smallint) Validate_Value from ( select guid, count(guid) cnt from…

hadoop apache-spark hive hiveql shark-sql

asked Sep 09 '15 at 22:43

underwood
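
The excerpt above cuts off before the inner query is complete, so the following is only a minimal sketch of one common way to get a 1/0 duplicate flag: group on the key column, keep groups with more than one row via HAVING, and wrap the result in the same CASE expression. It runs against SQLite rather than Hive so that it is self-contained; the `events` table, the `has_duplicates` helper, and the HAVING clause are assumptions for illustration, not the asker's actual query.

```python
import sqlite3


def has_duplicates(conn, table, column):
    """Return 1 if any value of `column` occurs more than once in `table`, else 0.

    `table` and `column` are interpolated directly, so pass trusted names only.
    """
    query = f"""
        SELECT CAST(CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END AS SMALLINT) AS validate_value
        FROM (
            SELECT {column}, COUNT({column}) AS cnt
            FROM {table}
            GROUP BY {column}
            HAVING COUNT({column}) > 1  -- assumed filter: keep only repeated keys
        ) AS dup
    """
    return conn.execute(query).fetchone()[0]


# Hypothetical sample data: the guid "a" appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (guid TEXT)")
conn.executemany("INSERT INTO events VALUES (?)", [("a",), ("b",), ("a",)])
dup_flag = has_duplicates(conn, "events", "guid")
```

The outer COUNT(*) counts duplicated keys, not duplicated rows, which is enough for a yes/no answer.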

vote

1 answer

SPARK - How to use function in group by query

I am going to migrate a Shark query to Spark. Below is my sample Shark query, which uses a function in the GROUP BY clause: select month(dt_cr) as Month, day(dt_cr) as date_of_created, count(distinct phone_number) as total_customers …

apache-spark shark-sql

asked Jan 08 '15 at 13:09

sandip
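
The GROUP BY part of the query above is not visible in the excerpt. One portable pattern, sketched here under assumptions, is to repeat the full function expression in GROUP BY instead of relying on the SELECT alias. The sketch uses SQLite so it can run standalone: `strftime` stands in for the `month()` and `day()` functions, and the `customers` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (dt_cr TEXT, phone_number TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("2015-01-08", "111"),
        ("2015-01-08", "222"),
        ("2015-01-08", "111"),  # repeated phone number on the same day
        ("2015-01-09", "333"),
    ],
)

# The grouping expressions are written out again in GROUP BY
# rather than referring to the aliases defined in SELECT.
query = """
    SELECT CAST(strftime('%m', dt_cr) AS INTEGER) AS month,
           CAST(strftime('%d', dt_cr) AS INTEGER) AS date_of_created,
           COUNT(DISTINCT phone_number) AS total_customers
    FROM customers
    GROUP BY CAST(strftime('%m', dt_cr) AS INTEGER),
             CAST(strftime('%d', dt_cr) AS INTEGER)
    ORDER BY month, date_of_created
"""
result = conn.execute(query).fetchall()
```

Whether a given engine also accepts the alias in GROUP BY varies, so repeating the expression is the conservative choice.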

vote

1 answer

How to create a Shark query from a saved text file out of a RDD?

I have a JavaPairRDD results and I save it by calling: results.saveAsTextFile("data"). Then I get file contents like: (www.abc.com,0.15712321 www.def.com,www.aaa.com,www.ccc.com) Now, I want to create a table with three fields using…

apache-spark shark-sql apache-spark-sql

asked Sep 28 '14 at 15:26

MatrixZ
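
As a rough illustration of turning such a saved line back into three fields, the sketch below assumes the layout is `(host,score link1,link2,...)`: a key, then a value made of a number, a space, and a comma-separated list. That reading of the sample line is an assumption, and `parse_line` is a hypothetical helper, not part of Spark or Shark.

```python
def parse_line(line):
    """Split '(host,score link1,link2,...)' into (host, score, [links])."""
    body = line.strip()
    # Drop the surrounding parentheses written by the tuple's string form.
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    # The first comma separates the key from the value.
    host, rest = body.split(",", 1)
    # Within the value, the first space separates the score from the links.
    score_text, _, links_text = rest.partition(" ")
    links = [link for link in links_text.split(",") if link]
    return host, float(score_text), links


record = parse_line("(www.abc.com,0.15712321 www.def.com,www.aaa.com,www.ccc.com)")
```

Writing the fields out with an explicit delimiter before saving would avoid this parsing step, since the default text form of a tuple is not designed to be read back.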

vote

1 answer

How can I get Spark/Shark to start on DSE 4.5.1

This was initially working out of the box and then AWS kindly shut down this server for me. So I rebuilt it and made it the new job tracker (it was also the old job tracker). Now I can't figure out how to get Spark/Shark to run. I get the same…

apache-spark datastax-enterprise shark-sql

asked Sep 04 '14 at 14:42

Eric Lubow

vote

1 answer

Can someone explain this : "Spark SQL supports a different use case than Hive."

I am referring to the following link: Hive Support for Spark. It says: "Spark SQL supports a different use case than Hive." I am not sure why that would be the case. Does this mean that as a Hive user I cannot use the Spark execution engine through Spark…

hadoop hive apache-spark shark-sql

asked Aug 27 '14 at 18:35

Venkat

vote

1 answer

Shark external table performance

How does querying from an external table in Shark located on the local filesystem compare to using data located on HDFS in terms of query performance? I plan to use a single high-end server for running Shark queries and was wondering if its…

bigdata apache-spark shark-sql

asked Aug 12 '14 at 21:12

DaTaBomB

vote

1 answer

JDBC connection to Shark Server hangs

I am using the following configuration for my Shark cluster: Scala 2.10.3, Spark 0.9.0, Hive 0.12.0-cdh5.0.2, Shark 0.9.0. Spark and Hive are configured via Cloudera Manager (CDH 5.0.2). I am following this tutorial to connect to shark…

hadoop apache-spark hive shark-sql

asked Jul 21 '14 at 05:17

Junaid

vote

1 answer

Which Hadoop component can handle all Oracle queries?

Which Hadoop component can handle all the Oracle functions and has low latency? I am thinking of using components like Presto, Drill and Shark. Can anyone tell which of the above technologies can handle all the functions in Oracle with low…

oracle hadoop shark-sql presto

asked Jul 09 '14 at 18:42

Pavan Chakravarthy

vote

0 answers

java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat

I have followed this link for installing Shark on CDH5. I have installed it, but as is also mentioned in the above block: This -skipRddReload is only needed when you have some table with hive/hbase mapping, because of some issues in…

hive hbase apache-spark shark-sql

asked Jun 16 '14 at 09:45

Aashu

vote

0 answers

How to convert Spark's TableRDD to RDD[Array[Double]] in Scala?

I am trying to perform Scala operations on Shark. I am creating an RDD as follows: val tmp: shark.api.TableRDD = sc.sql2rdd("select duration from test") I need to convert it to RDD[Array[Double]]. I tried toArray, but it doesn't seem to work. I…

scala apache-spark shark-sql

asked Jun 13 '14 at 13:08

visakh

vote

2 answers

Installing Apache Shark in standalone mode results in a Scala error

I'm basically following the guide on https://github.com/amplab/shark/wiki/Running-Shark-Locally. I downloaded Scala. I'm using EC2 Amazon Linux. My shark/shark-0.8.0/conf/shark-env.sh configuration file looks like this: export SPARK_MEM=1g export…

scala amazon-ec2 apache-spark shark-sql

asked Jun 11 '14 at 22:44

user2773013

vote

1 answer

installing HDFS for use with SHARK without YARN

I'm trying to install Apache Shark. One of the requirements is to have HDFS installed. I don't want to use YARN or Mesos. I just want HDFS. My question is: Does this mean I can only install a Hadoop distribution prior to 2.x? If so, which one? Or can…

hadoop hdfs apache-spark shark-sql

asked Jun 11 '14 at 01:20

user2773013

vote

0 answers

Error in Configuring Spark/Shark on DSE

I have installed 1) scala-2.10.3 2) spark-1.0.0. Changed spark-env.sh with the variables below: export SCALA_HOME=$HOME/scala-2.10.3 export SPARK_WORKER_MEMORY=16g I can see the Spark master. 3) shark-0.9.1-bin-hadoop1. Changed shark-env.sh with below…

cassandra hive datastax-enterprise shark-sql metastore

asked Jun 10 '14 at 12:28

user3632180

vote

0 answers

Issue with loading data into Parquet table from a JSON Serde based Hive table

I have a HIVE table defined using a JSON Serde. I'm using the Shark distribution (http://shark.cs.berkeley.edu/). The definition is as follows: CREATE TABLE lastfm( artist string, title string , track_id string, similars array>, tags…

hadoop hive shark-sql parquet

asked May 20 '14 at 07:00

visakh
