1

A search engine like Google is based on MapReduce. How does it actually work? Suppose I enter a string in the search box: what happens after that? What goes into the mapper, what does the mapper output as key-value pairs, and what goes into the reducer? Thanks.

Monica Shiralkar

1 Answer

3

A related question: How can Google be so fast?

I only know what I once discussed with someone from Google (I can't fully verify its accuracy), but he told me they used something like MapReduce to build the indexes of all the words that appear in all web pages. Then, to answer queries, they used something like a distributed hash table: http://en.wikipedia.org/wiki/Distributed_hash_table. For each word you give them, they can calculate its hash and know exactly which computer in their network holds the information (the index) for searches of that word. They forward the request to that computer (they also use a lot of redundancy), which will probably have most of that information in memory.
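To make the mapper/reducer part concrete, here is a minimal single-process sketch of building a word index in MapReduce style, followed by a hash lookup that picks a machine for a query word. It is an illustration only, not Google's actual pipeline: the names `map_page`, `reduce_word`, `build_index` and `shard_for`, the example URLs, and the shard count are all made up, and the modulo-based routing is a simplification of what a real distributed hash table does.

```python
from collections import defaultdict
import hashlib

def map_page(url, text):
    """Mapper: takes one (url, page text) pair, emits (word, url) pairs."""
    for word in set(text.lower().split()):
        yield word, url

def reduce_word(word, urls):
    """Reducer: takes one word and all urls emitted for it, returns its posting list."""
    return word, sorted(set(urls))

def build_index(pages):
    """Runs map, shuffle (group by key) and reduce in one process."""
    grouped = defaultdict(list)
    for url, text in pages.items():
        for word, page_url in map_page(url, text):
            grouped[word].append(page_url)  # shuffle step: group values by word
    return dict(reduce_word(word, urls) for word, urls in grouped.items())

def shard_for(word, num_shards):
    """Hypothetical routing: hash the word to choose which machine holds its index."""
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Hypothetical input: two tiny "web pages".
pages = {
    "a.example": "map reduce builds the index",
    "b.example": "the index answers queries",
}
index = build_index(pages)
```

So, under these assumptions, the mapper input is a page, the mapper output is `(word, url)` pairs, and the reducer receives one word together with every URL that contains it. At query time no MapReduce job runs: `shard_for("index", 4)` only decides which of four machines to ask, and that machine returns the precomputed list for the word.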

user1494736
  • Thanks. So I understand that it looks up which cluster holds the index for the search. Now the question is: what goes into the mapper and what goes into the reducer? – Monica Shiralkar Jul 07 '12 at 18:31
  • MapReduce is not used for real-time querying but rather for batch analytics of what users have searched for, for example to find frequent search queries. – Thomas Jungblut Jul 07 '12 at 20:05
  • OK. So does that mean anything we search for in Google has nothing to do with MapReduce, and MapReduce is used only for analytics? – Monica Shiralkar Aug 11 '12 at 11:37
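
The batch-analytics use mentioned in the comments (finding frequent search queries) could look roughly like the sketch below. It is a single-process illustration with a made-up query log; `map_query`, `reduce_count` and `count_queries` are hypothetical names, not part of any real framework.

```python
from collections import defaultdict

def map_query(log_line):
    """Mapper: takes one logged query, emits (normalized query, 1)."""
    yield log_line.strip().lower(), 1

def reduce_count(query, counts):
    """Reducer: takes one query and all its 1s, returns the total count."""
    return query, sum(counts)

def count_queries(log_lines):
    """Runs map, group-by-key and reduce, then sorts by descending count."""
    grouped = defaultdict(list)
    for line in log_lines:
        for query, one in map_query(line):
            grouped[query].append(one)
    results = [reduce_count(query, counts) for query, counts in grouped.items()]
    return sorted(results, key=lambda kv: (-kv[1], kv[0]))

# Hypothetical query log, one search per entry.
log = ["map reduce", "Google speed", "map reduce ", "hadoop"]
top = count_queries(log)
```

Here the mapper input is a log line, the mapper output is `(query, 1)`, and the reducer sums the ones for each distinct query. This runs offline over stored logs, not while a user is waiting for results.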