Questions tagged [shark-sql]

Shark has been subsumed by Spark SQL. It was an open source distributed SQL query engine for Hadoop data that brought state-of-the-art performance and advanced analytics to Hive users.

59 questions
0
votes
1 answer

Shark integration with DataStax Enterprise 4.0.3 Cassandra

I'm trying to integrate Shark 0.9.1 (for Hadoop 1) with Hive on a DataStax Enterprise 4.0.3 Hadoop node. I have successfully installed and configured Scala 2.10.3 and Spark 1.0.0. The Scala and Spark shells are also working fine. Now when I'm trying to open…
user3632180
  • 105
  • 2
  • 13
0
votes
1 answer

Querying Cassandra using Shark takes too much time

I have set up a two-node Cassandra cluster and am trying to perform queries using Shark. The query works, but it takes around 10 minutes. (I used Cloudera to install the software for me.) Time taken: 421.189 seconds shark> I…
Tharanga
  • 115
  • 2
  • 2
  • 7
0
votes
0 answers

Exception when trying to execute SQL using Apache Shark

I'm trying to use a Hive metastore with shark-0.9.1 (hive-0.11.0). For now, I'd be happy getting it running on a single node, so no slave nodes are involved. When running Hive, I can create tables and execute SQL statements such as hive> SELECT MAX(rating)…
helm
  • 713
  • 2
  • 16
  • 30
0
votes
1 answer

Big data handling using cassandra in real time

I am developing an application for sales force. I am not able to figure out how to manage big data in my application. Following are the scenarios. I have location divided based on following criteria. Country => State => City => Territory => Area…
0
votes
1 answer

Invalid cache type exception in Shark

I am trying to create a cached table in shark-0.8.0. As per the documentation (https://github.com/amplab/shark/wiki/Shark-User-Guide), I created the table as follows: CREATE TABLE mydata_cached ( artist string, title string , track_id string, …
visakh
  • 2,503
  • 8
  • 29
  • 55
0
votes
1 answer

Limit number of rows from JOIN

I am trying to JOIN both tables ON scores.updated_at_yyyy_mm = distributions.range_yyyy_mm which of course works, but also LIMIT the number of rows returned from the scores table according to 'count' given in the distributions table, which…
juwalter
  • 11,472
  • 5
  • 19
  • 18
0
votes
2 answers

Create table joining two existing tables in Shark Hive

I have two tables oldTable and newTable with the contents as : oldTable : key value volume ====================== 1 abc 10000 2 def 5000 newTable : key value volume ====================== 1 abc …
gaganbm
  • 2,663
  • 3
  • 24
  • 36
0
votes
1 answer

AMPLab Shark on Apache Spark

As per the documentation, "Apache Spark is a fast and general engine for large-scale data processing." "Shark is an open source distributed SQL query engine for Hadoop data." And Shark uses Spark as a dependency. My question is, does Spark just parse…
Murali Mopuru
  • 6,086
  • 5
  • 33
  • 51
0
votes
1 answer

FAILED: Hive Internal Error: java.util.NoSuchElementException(null) while running a CREATE TABLE query from shark command line

I am trying to create a table in the Hive metastore using Shark by executing the following command: CREATE TABLE src(key int, value string); but I always get: FAILED: Hive Internal Error: java.util.NoSuchElementException(null) Read about the same thing…
ravihemnani
  • 177
  • 2
  • 10
0
votes
1 answer

Query through Shark API not working

I am trying to run a query (a simple select) through the Shark Java API against a Hive table on a cluster. However, I get this error message: 14/01/15 17:25:54 INFO cluster.ClusterTaskSetManager: Loss was due to…
Radu C.
  • 11
  • 2
0
votes
1 answer

Unable to recover partitions in Shark for Hive table with S3 location

I'm trying to use Shark on EMR and I can't seem to recover my partitions from a table with its location set to an S3 bucket. I get nothing when I try to show my partitions. shark> MSCK REPAIR TABLE logs ; OK Time taken: 1.79 seconds shark>…
Mattias
  • 158
  • 2
  • 9
0
votes
2 answers

Integrate Play framework with Berkeley Shark

I am trying to connect from a Play 2.0.8-based Scala application to a Berkeley Shark context to fetch data from Shark tables. Can you please tell me how to do this? The Spark documentation is sparse. Thanks
-1
votes
1 answer

AMPLab Shark with HBase

What is a good way to set up access to an HBase table through Shark queries? I explored some articles which are geared towards setting up HBase with Hive, such as https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration , but I'm not too sure how…
DaTaBomB
  • 623
  • 3
  • 11
  • 23
-2
votes
1 answer

Which is better in terms of speed, Shark or Spark?

I am very confused about these two. I know Shark is the same as Hive but 100x faster, and that it works on Spark. I want to know the main difference between Spark and Shark. Which is better, meaning faster? When should I use Spark and when Shark?
lucy s
  • 45
  • 1
  • 1
  • 5