
I am a junior system administrator at one of the engineering schools. One of the professors received a donation of 45 servers (Dell PowerEdge 1690) from Yahoo. His requirements are as follows:

  • Hadoop (MapReduce) on Linux (which flavor of Linux and Hadoop?)

  • Pig on top of Hadoop

  • Dryad on top of Windows

  • MPI on Linux

  • possibly other software, say for cloud computing

I would like to create a cluster using VMware so that I can use the hardware optimally. I am very new to virtualization. Can anyone suggest how to go about it? I really look forward to working on this project, as it will give me good exposure and some hands-on experience.

This will be a lab that many students will log in to simultaneously. I am planning to use LDAP authentication, which will authenticate students against our Active Directory.

So how do I go about it? What strategy will be the best one in this scenario? Any input is appreciated. Thank you.

Anup

1 Answer


Sounds like you've got a mountain of knowledge to climb. I doubt you'll be able to adequately design what you're looking for without first spending a solid year or so learning all of these technologies.

That said, I'd forget using something fancy like Hadoop. You didn't mention what kind of storage you have, but I'd try to pull some sort of SAN together, possibly powered by FreeNAS, providing iSCSI targets. Hyper-V Server is free and able to form clusters of up to 16 nodes. SCVMM is fairly cheap with an educational discount, and would be able to provide a self-service portal with AD authentication to allow students or researchers to access the clusters, deploy pre-configured VMs, or provision their own (lots of configuration options). It's possible to add other technologies to make it more or less complicated as your demands require (this is where you will get into "cloud" [I hate that marketing term, but it's apparently what you know] technologies that enable high availability, dynamic load balancing, workload spreading, proactive management, etc.).
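As a rough illustration of what the 16-node limit means for 45 servers, here is a minimal Python sketch. The function name `plan_clusters` is made up for this example, and it assumes every server becomes a cluster node; in practice you would probably hold some hosts back for storage or management, so treat the numbers as a starting point only.

```python
import math


def plan_clusters(total_hosts, max_nodes_per_cluster):
    """Split hosts into the fewest clusters allowed by the node limit,
    keeping cluster sizes as even as possible.

    Hypothetical helper for illustration; assumes every host is a node.
    """
    if total_hosts < 0 or max_nodes_per_cluster < 1:
        raise ValueError("host count must be >= 0 and node limit >= 1")
    if total_hosts == 0:
        return []
    # Fewest clusters that can hold every host without exceeding the limit.
    cluster_count = math.ceil(total_hosts / max_nodes_per_cluster)
    # Spread hosts evenly; the first `extra` clusters take one more host.
    base, extra = divmod(total_hosts, cluster_count)
    return [base + 1 if i < extra else base for i in range(cluster_count)]


if __name__ == "__main__":
    # 45 donated servers, 16 nodes per cluster at most.
    print(plan_clusters(45, 16))  # [15, 15, 15]
```

So the donated hardware would need at least three clusters, which also leaves one spare slot per cluster under the limit.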

VMware's products could provide much of the same functionality, but I'm not as familiar with their lineup, so I couldn't say whether it would be cheaper or not.

Unfortunately, your question is quite vague as to what you hope to accomplish. This site has plenty of system administrators, and we know that a good set of requirements is the key to a successful project. Throwing an alphabet soup of technologies into a project and hoping they all work together is a recipe for disaster. We'd be glad to work with you, but you really need a good set of required outcomes before starting the project.

Chris S
  • Agreed. I can talk all day about hooking Linux into AD, and general Linux clustering with PBS-type tools. But I'm not sure about the faculty requirements (some give requirements based on things they've already been working on, others Google to come up with something that has the right buzzwords). Not sure how VMware fits in, either (not having to reboot physical hardware from Windows to Linux, maybe?). – Mike Renfro May 07 '11 at 11:40
  • I guess the best thing you can do right now is try to figure out what the end users want to do. Don't ask what technologies you should run on the back end; that will be a function of the UX. So if your users want to have virtual machines to experiment with various operating systems, applications, or configurations, that would be a requirement. But if your endgame isn't to provide services to "customers" (the primary idea behind "cloud") but instead to experiment with the computer hardware in various configurations, I'd recommend doing smaller clusters in several different setups. – Chris S May 07 '11 at 14:51