MapReduce alternatives

Question

Are there any alternative paradigms to MapReduce (Google, Hadoop)? Is there any other reasonable way to split and merge big problems?

@ralu: There are many ways to deal with big problems. MapReduce is DEFINITELY only one of them, and it is DEFINITELY both a paradigm and an algorithm. Its implementation also becomes a technology, but I am interested in ideas rather than implementations. Thank you. – Cartesius00, Jan 01 '12 at 11:17
Why do you think about your problem as split and merge? You just need to solve the problem. For instance, Apache Pig deals with data using an SQL-like language. There is no split-and-merge way of thinking there, although it can run on a cluster of hundreds of machines and uses Hadoop as its platform. – Luka Rahne, Jan 01 '12 at 12:30
@ralu: Hive has the SQL-like syntax. The Pig syntax is completely different. – Niels Basjes, Jan 01 '12 at 12:46
@ralu: I am looking for ideas; you're on a completely different level, that of implementation. – Cartesius00, Jan 01 '12 at 13:57
@Niels Basjes you are right, but my point concerns how you view the problem. If a problem can be expressed as split/merge, MapReduce is the way to go, because it was made for this kind of thing. The point is that you need something in which the problem is easy to express and which can later be run on a computational device. A cluster is just a computational device, and optimization is a compiler/framework problem. Unfortunately, most of them are still pretty dumb. – Luka Rahne, Jan 01 '12 at 17:56

Nicolas78 · Accepted Answer · 2014-01-26T17:06:43.333

13

Definitely. Check out, for example, Bulk Synchronous Parallel (BSP). Map/Reduce is in fact a very restricted way of reducing problems; however, that restriction is what makes it manageable in a framework like Hadoop. The question is whether it is less trouble to press your problem into a Map/Reduce setting, or whether it is easier to create a domain-specific parallelization scheme and take care of all the implementation details yourself. Pig, in fact, is only an abstraction layer on top of Hadoop that automates many standard problem transformations from not-Map-Reduce-y to Map-Reduce-compatible.
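To make the BSP idea a bit more concrete, here is a minimal single-process sketch of its superstep structure: local computation, message exchange, then a barrier before the next round. It is only an illustration under my own assumptions; the function name `bsp_max` and the max-propagation task are hypothetical and not taken from Hama or any other framework.

```python
def bsp_max(values, neighbors):
    """Propagate the largest value through a graph using BSP-style supersteps.

    values:    dict mapping vertex -> initial number
    neighbors: dict mapping vertex -> list of adjacent vertices
    Returns (final state, number of supersteps executed).
    """
    state = dict(values)
    # Before the first superstep, every vertex announces its own value.
    outbox = dict(state)
    supersteps = 0
    while outbox:
        # Communication phase: deliver messages sent in the previous round.
        inbox = {v: [] for v in state}
        for sender, value in outbox.items():
            for target in neighbors[sender]:
                inbox[target].append(value)
        # Barrier: a real system would synchronize all workers here.
        # Local computation phase: each vertex reads only its own inbox.
        outbox = {}
        for v, messages in inbox.items():
            if messages and max(messages) > state[v]:
                state[v] = max(messages)
                outbox[v] = state[v]  # value changed, notify neighbors next round
        supersteps += 1
    return state, supersteps


if __name__ == "__main__":
    # Hypothetical chain graph a - b - c - d.
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    final, rounds = bsp_max({"a": 3, "b": 6, "c": 2, "d": 1}, graph)
    print(final, rounds)
```

Unlike a Map/Reduce job, the loop keeps per-vertex state across rounds and stops when no vertex has anything left to send, which is why iterative graph problems tend to fit this model more naturally.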

Edit 26.1.13: Found a nice up-to-date overview here

edited Jan 26 '14 at 17:06

answered Jan 01 '12 at 16:13

3

[Apache Hama](http://incubator.apache.org/hama/) implements BSP. Hama has been ported to [YARN (Yet Another Resource Negotiator)](http://wiki.apache.org/hama/GettingStartedYARN), which is part of Hadoop 0.23. Check this [blog](http://codingwiththomas.blogspot.com/) on Apache Hama. – Praveen Sripati Jan 01 '12 at 17:10
Thanks Praveen ;) Please visit our website and wiki for more information about hama http://incubator.apache.org/hama/ – Thomas Jungblut Jan 02 '12 at 18:05

Pete Kirkham · Answer 2 · 2021-05-18T15:48:21.697

Phil Colella identified seven numerical methods for scientific computation based on the patterns of scattering and gathering of data between processing nodes, and called them 'dwarfs'. Others have since added to them; a list is available at the Dwarf Mine:

Dense Linear Algebra
Sparse Linear Algebra
Spectral Methods
N-Body Methods
Structured Grids
Unstructured Grids
MapReduce
Combinational Logic
Graph Traversal
Dynamic Programming
Backtrack and Branch-and-Bound
Graphical Models
Finite State Machines

Robert Metzger · Answer 3 · 2014-08-24T13:05:10.153

Update (August 2014): Stratosphere is now called Apache Flink (incubating).

Have a look at Stratosphere. It is another Big Data runtime that offers more operators (map, reduce, join, union, cross, iterate, ...). It also lets you define advanced data flow graphs (with Hadoop MR, you would have to chain jobs).
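As a rough illustration of why extra operators help, the following plain-Python sketch expresses a join followed by a map and a reduce as one small pipeline. It does not use the Stratosphere/Flink API; `map_op`, `join_op`, `reduce_op`, `revenue_by_country` and the sample data are all hypothetical names made up for this example.

```python
from collections import defaultdict


def map_op(records, fn):
    """Apply fn to every record independently."""
    return [fn(r) for r in records]


def join_op(left, right, left_key, right_key):
    """Hash join: pair each left record with the right records sharing its key."""
    index = defaultdict(list)
    for r in right:
        index[right_key(r)].append(r)
    return [(l, r) for l in left for r in index.get(left_key(l), ())]


def reduce_op(records, key, fn, initial):
    """Group records by key and fold each group with fn."""
    groups = {}
    for r in records:
        k = key(r)
        groups[k] = fn(groups.get(k, initial), r)
    return groups


def revenue_by_country(orders, customers):
    # orders:    (order_id, customer_id, amount)
    # customers: (customer_id, country)
    joined = join_op(orders, customers, lambda o: o[1], lambda c: c[0])
    pairs = map_op(joined, lambda oc: (oc[1][1], oc[0][2]))  # (country, amount)
    return reduce_op(pairs, key=lambda p: p[0],
                     fn=lambda acc, p: acc + p[1], initial=0.0)


if __name__ == "__main__":
    orders = [(1, "c1", 10.0), (2, "c2", 5.0), (3, "c1", 2.5)]
    customers = [("c1", "DE"), ("c2", "FR")]
    print(revenue_by_country(orders, customers))
```

With only map and reduce available, the join step would typically have to be encoded as its own job (tagging records by source and grouping on the key) and chained to the aggregation job; a runtime with a first-class join can plan the whole graph at once.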

Stratosphere also supports BSP with its graph processing abstraction (called Spargel).

If you like to read scientific papers, have a look at "Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing"; it explains the theoretical background of the system.

Another system in the field is Spark, which has its own model (RDDs). Since BSP has been mentioned here, also have a look at GraphLab; they offer an alternative to BSP.

score 0 · Answer 4 · answered May 22 '13 at 20:41

0

Microsoft's Dryad is claimed to be more general than MapReduce.

DarenW

score 0 · Answer 5 · answered Apr 21 '18 at 16:22

0

The best alternative to MapReduce is Spark, because it is often cited as being 10 to 100 times faster than MapReduce. It is also very easy to maintain and gives high performance with less code.

Praveen K
