3

I know that Hadoop is based on a master/slave architecture:

HDFS works with NameNodes and DataNodes,

and MapReduce works with JobTrackers and TaskTrackers.

But I can't find all of these services on MapR. I found out that it has its own architecture with its own services.

I'm a little confused. Could anyone please tell me what the difference is between using Hadoop alone and using it with MapR?

Ravindra babu

3 Answers

6

You have to refer to the latest Hadoop 2.x architecture, since YARN (Yet Another Resource Negotiator) and High Availability were introduced in the 2.x version.

The JobTracker and TaskTracker are replaced by the ResourceManager, the NodeManager and a per-application ApplicationMaster.
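As a rough illustration of that replacement, here is a minimal sketch that maps the Hadoop 1.x daemons to their Hadoop 2.x counterparts. The dictionary and function names are hypothetical and only for illustration; the daemon names are the ones used by Apache Hadoop, where the per-application role is called ApplicationMaster.

```python
from typing import List

# Hypothetical lookup table: Hadoop 1.x (MRv1) daemon -> Hadoop 2.x (YARN) daemons.
# The JobTracker's duties are split between a cluster-wide ResourceManager and a
# per-application ApplicationMaster; the TaskTracker's role moves to the NodeManager.
MRV1_TO_YARN = {
    "JobTracker": ["ResourceManager", "ApplicationMaster"],
    "TaskTracker": ["NodeManager"],
}

# The HDFS daemons keep the same names in both generations.
HDFS_DAEMONS = ["NameNode", "DataNode"]


def yarn_counterparts(daemon: str) -> List[str]:
    """Return the Hadoop 2.x daemons that take over the role of a 1.x daemon."""
    if daemon in HDFS_DAEMONS:
        return [daemon]
    # Unknown names yield an empty list rather than raising an error.
    return MRV1_TO_YARN.get(daemon, [])


if __name__ == "__main__":
    for name in ["JobTracker", "TaskTracker", "NameNode"]:
        print(name, "->", ", ".join(yarn_counterparts(name)))
```

This only covers the Apache Hadoop side; it says nothing about how MapR names or organizes its own services.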

Hadoop 2.x YARN & High Availability

For the MapR architecture, refer to the MapR article.

For a comparison between different distributions, refer to this image:

enter image description here

A detailed comparison is available in the Data-Magnum article by Bill Vorhies.

Ravindra babu
3

MapR and Apache Hadoop do NOT have the same architecture at the storage level. MapR uses its own filesystem, MapR-FS, which is completely different from HDFS in terms of concept and implementation. You can find a more detailed comparison here: https://www.mapr.com/blog/comparing-mapr-fs-and-hdfs-nfs-and-snapshots#.VfGwwxG6eUk https://www.mapr.com/resources/videos/comparison-mapr-fs-and-hdfs

Pradeep Bhadani
  • MapR uses 80% of the Apache distribution as its baseline. See the chart here: http://www.networkworld.com/article/2369327/software/comparing-the-top-hadoop-distributions.html – Gyanendra Dwivedi Sep 11 '15 at 07:37
  • MapR does support the computational tools that come from Apache, but it is completely different from the point of view of storage (the filesystem) and cluster services, which is the context of this question. – Pradeep Bhadani Sep 11 '15 at 09:45
  • Links are dead. – Jolta Oct 28 '20 at 12:58
2

MapR uses most of the Apache big data distributions as its baseline.

enter image description here

MapR is a provider of a Hadoop (and big data technology stack) distribution, with certain add-ons and technical support for its clients.

Underneath, MapR is built entirely on the same architecture as Apache Hadoop, including all of the core library distributions. However, the MapR distribution is more like a bundle: a complete and compatible big data technology package.

The main benefit of MapR is that its distributions of various technologies such as Hive, HBase, Spark, etc. will be compatible with core Hadoop and with each other. This is particularly important because big data technologies evolve at different paces, so new releases become incompatible very quickly.

So vendors like MapR, Cloudera, etc. provide their own versions of the Hadoop distribution, along with support, so that end users can concentrate on building their products without worrying about compatibility issues. But almost all of them use the Apache distribution under the hood.

In the future, they might come up with certain variations and additional features in an attempt to prevent clients from switching to other vendors, but as of now that is not the case.

Gyanendra Dwivedi
  • MapR and Apache Hadoop do NOT have the same architecture at the storage level. MapR uses its own filesystem, MapR-FS, which is completely different from HDFS in terms of concept and implementation. You can find a more detailed comparison here: https://www.mapr.com/resources/videos/comparison-mapr-fs-and-hdfs https://www.mapr.com/blog/comparing-mapr-fs-and-hdfs-nfs-and-snapshots#.VfGrUBG6eUk – Pradeep Bhadani Sep 10 '15 at 16:07
  • @Pradeep, did I say that the MapR and Apache distributions are exactly the same? Can you confirm that MapR has a proprietary version of its entire big data platform and has not taken the Apache distribution as its baseline? – Gyanendra Dwivedi Sep 11 '15 at 07:44
  • I said they are not the same at the storage level. The concept of storing data in MapR-FS and in HDFS is completely different. The above diagram compares the computational tools (not the storage part). MapR does support nearly all of the computational tools (like MR, Hive, Pig, etc.) that come from Apache. The question was asked in the context of the HDFS architecture. – Pradeep Bhadani Sep 11 '15 at 09:44