4

As far as I know, Spark preloads the data from every node's disk (HDFS) into every node's RDDs to compute. But I guess MapReduce must also load the data from HDFS into memory and then compute on it in memory. So why is Spark faster? Is it just because MapReduce loads the data into memory every time it wants to compute, while Spark preloads the data? Thank you very much.

DunkOnly
  • possible duplicate of [Is caching the only advantage of spark over map-reduce?](http://stackoverflow.com/questions/24705724/is-caching-the-only-advantage-of-spark-over-map-reduce) – aaronman Aug 19 '14 at 13:42
  • maybe you can find the answer from here https://stackoverflow.com/questions/32572529/why-is-spark-faster-than-hadoop-map-reduce – Oshan Wisumperuma Jul 31 '18 at 05:27

2 Answers

0

Spark uses the concept of a Resilient Distributed Dataset (RDD), which lets it transparently keep data in memory and persist it to disk when needed.

In MapReduce, on the other hand, the data is shuffled and sorted (a synchronisation barrier) and written to disk after the map and reduce tasks.

In Spark, there is no such synchronisation barrier slowing things down, and the use of memory makes the execution engine really fast.
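To make the contrast concrete, here is a toy model in plain Python. It is not the Spark or Hadoop API: `FakeDisk`, `run_reloading` and `run_cached` are made-up names for illustration, and the "disk" is just a counter. It only shows how many times an iterative job goes back to storage when each pass reloads its input versus when the input is loaded once and kept in memory.

```python
# Toy model (plain Python, NOT the Spark or Hadoop API): count how often
# an iterative job has to go back to "disk" for its input.
# FakeDisk, run_reloading and run_cached are hypothetical names.


class FakeDisk:
    """Stands in for HDFS; counts every full read of the dataset."""

    def __init__(self, records):
        self._records = list(records)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._records)


def run_reloading(disk, iterations):
    """MapReduce-style chain: each pass reads its input from disk again."""
    total = 0
    for _ in range(iterations):
        data = disk.load()  # every pass starts from disk
        total += sum(x * x for x in data)
    return total


def run_cached(disk, iterations):
    """Spark-style: load once, keep the dataset in memory, reuse it."""
    cached = disk.load()  # single read, then reuse from memory
    total = 0
    for _ in range(iterations):
        total += sum(x * x for x in cached)
    return total


disk_a = FakeDisk(range(5))
disk_b = FakeDisk(range(5))
result_a = run_reloading(disk_a, iterations=10)
result_b = run_cached(disk_b, iterations=10)
print(result_a == result_b)        # both strategies compute the same value
print(disk_a.reads, disk_b.reads)  # number of disk reads for each strategy
```

In real Spark code the "keep it in memory" step corresponds roughly to calling `rdd.cache()` or `rdd.persist(StorageLevel.MEMORY_AND_DISK)` on an RDD that is reused across several actions. The benefit shows up mainly for iterative or interactive workloads, not for a single pass over the data.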

  • There is still a shuffle in Spark, and actually I believe they only recently implemented an in-memory shuffle; before that it also created files – aaronman Aug 19 '14 at 13:55
  • Check out [this paper](http://www.cs.berkeley.edu/~kubitron/courses/cs262a-F13/projects/reports/project16_report.pdf) for some examples of Spark's issues with shuffle; it's actually worse than Hadoop's for the time being – aaronman Aug 19 '14 at 14:26
0

Hadoop MapReduce:

  1. Hadoop MapReduce does batch processing.

  2. It has high latency because it works through HDFS.

Here is a full explanation of Hadoop MapReduce and Spark:

http://commandstech.com/basic-difference-between-spark-and-map-reduce-with-examples/

Spark:

  1. Spark, by contrast, supports stream processing.

  2. It has low latency because of RDDs.

Spandana r