
As per the documentation:

"Apache Spark is a fast and general engine for large-scale data processing."

"Shark is an open source distributed SQL query engine for Hadoop data."

And Shark uses Spark as a dependency.

My question is: does Shark just parse HiveQL into Spark jobs, or does it do anything more that makes it a good choice for fast responses on analytical queries?

Murali Mopuru

1 Answer


Yes, Shark uses the same idea as Hive but translates HiveQL into Spark jobs instead of MapReduce jobs. Please read pages 13-14 of this document for the architectural differences between the two.
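For intuition, here is a rough sketch in plain Spark (Scala) of the kind of job a HiveQL query can be translated into. The query, table path, and column layout are hypothetical, and this is not Shark's actual planner output, just an illustration of "SQL becomes distributed Spark operators instead of MapReduce stages":

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical HiveQL query:
    //   SELECT page, COUNT(*) FROM logs GROUP BY page
    // expressed directly as Spark RDD operations -- roughly the shape of
    // job an engine like Shark would run instead of a MapReduce job.
    object SharkLikeJob {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("shark-like-job"))

        // Assume a tab-separated "logs" table with the page name in column 0
        // (both the path and the layout are made up for this example).
        val logs = sc.textFile("hdfs:///warehouse/logs")

        val counts = logs
          .map(line => (line.split("\t")(0), 1L)) // GROUP BY page
          .reduceByKey(_ + _)                     // COUNT(*) per page

        counts.take(10).foreach(println)
        sc.stop()
      }
    }

Because the intermediate data stays in Spark's memory-based RDDs (and can be cached), repeated analytical queries over the same tables can respond much faster than rerunning MapReduce stages each time.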

Viacheslav Rodionov