-3

I am not able to understand the real essence of Hadoop. If I have enough resources to buy a supercomputer that can process petabytes of data, why would I need a Hadoop infrastructure to manage such huge data?

N2M
  • Thank you, people. I was asked this question in an interview. I tried convincing the interviewer with all these OBVIOUS answers, but he was not convinced. So I just wanted to know if there is something unique about Hadoop that I am unaware of. – N2M Aug 06 '13 at 08:04

2 Answers

2

The whole point of Hadoop is to be able to process huge amounts of data on commodity, heterogeneous machines. Nothing about it rules out the use of supercomputers.
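To make "processing data on many ordinary machines" a bit more concrete, here is a minimal single-process sketch of the map/shuffle/reduce pattern that Hadoop MapReduce jobs follow. It is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and the chunks are processed in a plain loop rather than on separate nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(chunks: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce over the chunks in one process."""
    mapped: List[Tuple[str, int]] = []
    for chunk in chunks:
        # Each chunk is mapped independently of the others; on a cluster,
        # that independence is what lets chunks go to different machines.
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop runs on commodity machines",
                      "commodity machines fail"]))
```

Because each call to `map_phase` depends only on its own chunk, the same job can be spread over as many machines as there are chunks, which is the property the answer is pointing at.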

1

Having enough resources often makes us dumb. Let me give you an example (don't worry, it involves Hadoop) that will make this clear. Cray's cheapest supercomputer, the XC30-AC, costs $500,000 (IIRC). Now, what is the cost of a decent computer with decent RAM, CPU, and disk? How much would it cost to buy a bunch of them and use their power collectively? How much space and how many resources do you need to house and manage these machines? How difficult is it to find folks with decent programming skills who can write MR jobs for you?

These are just a few things. Hadoop is open source. Use it and tweak it as you wish. Get awesome support through the mailing list for free. Not only support, but suggestions as well. I hope you get the point.

Utilizing your resources wisely is more important than just having them.

Tariq
  • Thanks for the reply. I already knew this, but it doesn't answer my question of how Hadoop is better than any supercomputer (ignoring the cost). – N2M Aug 06 '13 at 08:05
  • You are welcome, @N2M. Better in terms of what? My better can be your worse. – Tariq Aug 06 '13 at 08:13
  • It's OK, never mind. I am convinced that Hadoop is what it is today mainly because it is deployable on commodity hardware. As far as business is concerned, the aim is profit at lower cost, so Hadoop would definitely be an option over supercomputers. Also, since it is open source and easy to use, anybody and everybody can use it freely, and the support available is super-awesome. Please correct me if I am wrong anywhere. – N2M Aug 06 '13 at 11:27