I agree that Spark handles iterative and interactive programming paradigms much better than MapReduce. I also agree that we can use HDFS, or any Hadoop data store such as HBase, as the storage layer for Spark.
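To make the iterative case concrete, here is a minimal sketch of the kind of loop I mean, where Spark caches the working set in memory across iterations instead of re-reading from HDFS after every job the way chained MapReduce jobs would. The file name `data.txt`, the local master, and the toy convergence loop are just assumptions for illustration, not any specific workload:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IterativeSketch {
  def main(args: Array[String]): Unit = {
    // Assumed local setup purely for demonstration
    val conf = new SparkConf().setAppName("IterativeSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical input: one numeric value per line in data.txt
    // cache() keeps the parsed data in memory across iterations
    val points = sc.textFile("data.txt").map(_.toDouble).cache()

    var estimate = 0.0
    for (_ <- 1 to 10) {
      // Each pass re-uses the cached RDD from memory; an equivalent
      // MapReduce pipeline would write to and re-read from HDFS per pass
      val meanDiff = points.map(x => x - estimate).mean()
      estimate += meanDiff
    }

    println(s"Converged estimate: $estimate")
    sc.stop()
  }
}
```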
Therefore, my question is: are there any real-world use cases where Hadoop MapReduce is better than Apache Spark? Here "better" is meant in terms of performance, throughput, and latency. Is Hadoop MapReduce still the better choice for BATCH processing than Spark? If so, can anyone explain the advantages of Hadoop MapReduce over Apache Spark? Please keep the entire scope of the discussion to the COMPUTATION LAYER.