Questions tagged [shark-sql]

Shark has been subsumed by Spark SQL. It was an open source distributed SQL query engine for Hadoop data that brought state-of-the-art performance and advanced analytics to Hive users.

59 questions
1
vote
1 answer

Scala Spark / Shark: How to access existing Hive tables in Hortonworks?

I am trying to find some docs / description of the approach on the subject, please help. I have Hadoop 2.2.0 from Hortonworks installed with some existing Hive tables I need to query. Hive SQL runs extremely and unreasonably slowly on a single node…
DarqMoth
  • 603
  • 1
  • 13
  • 31
1
vote
1 answer

Apache Shark 0.9.1 can't connect to HDFS?

In Shark, when I run: CREATE EXTERNAL TABLE test ( memberId STRING, category STRING, message STRING, source STRING, event_type STRING, log_level STRING, path STRING, host STRING, event_timestamp STRING, eventFields…
poliu2s
  • 657
  • 1
  • 10
  • 30
1
vote
1 answer

Shark getting started: all queries hanging

I am a newbie to Shark, though I do have some experience with Spark. Every attempt to retrieve data from Shark hangs. As a preliminary step, let's ensure that Spark is up and healthy: spark> val tf =…
WestCoastProjects
  • 58,982
  • 91
  • 316
  • 560
1
vote
1 answer

Installing a Spark Cluster, problems with Hive

I am trying to get a Spark/Shark cluster up but keep running into the same problem. I have followed the instructions on https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster and addressed Hive as stated. I think that the Shark Driver is…
slotishtype
  • 2,715
  • 7
  • 32
  • 47
1
vote
2 answers

Getting IncompatibleClassChangeError while running shark-0.9.0 with hadoop 2.2.0

I am getting the following error while running shark 0.9.0. Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class scala.collection.mutable.ArrayOps, but interface was expected at…
Subhradip Bose
  • 3,065
  • 2
  • 13
  • 17
1
vote
1 answer

How to sbt Shark API (sql2rdd) into Spark Interactive Shell

As a Linux noob, I recently set up Spark and Shark to play around. There is an API, sql2rdd, that I want to use to pull data from Shark into an RDD. However, I don't know where the sql2rdd library is or how to link it with the Spark Interactive…
1
vote
1 answer

Run Queries with Apache SHARK on Mac OSX

I am having trouble running queries with Shark locally on Mac OSX 10.8. I am trying to run some test queries on data stored in Hive. I am using Scala 2.9.3 and Hive 0.9.0 and both seem to be running fine. The Hive database is using MySQL to store…
DJElbow
  • 3,345
  • 11
  • 41
  • 52
1
vote
1 answer

How do hive and drill integrate?

Drill looks like an interesting tool for ad-hoc drill-down queries, as opposed to the high-latency Hive. It seems that there should be a decent integration between the two, but I couldn't find it. Let's assume that today all of my work is done on…
dimamah
  • 2,883
  • 18
  • 31
0
votes
0 answers

shark: FAILED: Error in semantic analysis: Exactly one argument is expected

I am using the APPROX_SUM query after I created a table of 5000 rows of random integers less than 1000. It always results in an exception Exactly one argument is expected. But I am using only one column with only integers as described below.…
bbh
  • 1
0
votes
1 answer

java HiveClient fails select: java.sql.SQLException: Query returned non-zero code: 9

I'm pretty new to Hive and HDFS, but I have managed to make a functioning HiveClient in Java that successfully connects and performs queries on my HDFS server. That is, all queries except select statements. My code looks like this: Statement…
Elin
  • 69
  • 1
  • 7
0
votes
2 answers

Does Spark support insert overwrite static partitions?

I noticed in the current Spark SQL manual that inserting into a dynamic partition is not supported: "Major Hive Features: Spark SQL does not currently support inserting to tables using dynamic partitioning." However, is insert/overwriting into static…
JeffLL
  • 1,875
  • 3
  • 19
  • 30
0
votes
1 answer

Can we use Shark 0.9.1 version with Spark 1.1.0?

I know Shark has been subsumed by Spark SQL, a new module in Apache Spark. But my question is, can we use the existing Shark with new Spark versions?
Devan M S
  • 692
  • 9
  • 23
0
votes
1 answer

hive internal error with Amplab shark on spark

Help needed, please. I have followed the steps to build Spark and Shark to query data from HDFS/Cassandra. I have a Cassandra cluster on HDFS and can successfully view the database, but I cannot run a select statement: shark> select * from calls_flow limit…
del
  • 199
  • 1
  • 1
  • 7
0
votes
1 answer

Running query from Amplab-shark to cassandra on hdfs

Help needed, please, with an Amplab-Shark query on Cassandra in HDFS. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/task/JobContextImpl. I can successfully run: use database, show tables; etc., but cannot run any…
del
  • 199
  • 1
  • 1
  • 7
0
votes
1 answer

Why does Shark running on EC2 give me a "Wrong FS" error when writing data to S3?

I am running Shark/Spark (0.9.1) on Amazon EC2 using the supplied setup scripts. I am reading data out of S3 and then trying to write back a table into S3. The data can be read from S3 fine (so my credentials are correct) but when I try to write…
gallamine
  • 865
  • 2
  • 12
  • 26