
Can anyone tell me the best way to practice Spark? Most people point to installing the Cloudera VM, but I find it hard to work with because my system has only 6 GB of RAM. Everything runs slowly, and Cloudera Manager never launches, which I believe is because it needs 8 GB of RAM. At the moment I have given 3 GB to my Windows OS and 3 GB to the VM. Is there a way to speed things up? Or can I install CDH on Ubuntu and then use Spark? (I have an Ubuntu installation on my machine, so I could allocate the complete 6 GB to Linux.) Or should I buy a new machine with 8 or 16 GB of RAM?

Processor: i5 560M; RAM: 6 GB (5.6 GB usable); VM: VMPlayer (VMware)

Can anyone also suggest a good configuration?

Garfield

1 Answer


To practice Spark with Hadoop on a Windows host machine with 6 GB of RAM, follow these steps:

  • Install Ubuntu as the guest OS in Oracle VM VirtualBox with 4 GB of RAM (2 GB for the Windows host is fine, assuming you are not running any CPU- or memory-intensive tasks on Windows).
  • With a 4 GB VM, I strongly suggest not using Cloudera and Cloudera Manager (they launch a lot of services in the background).
  • Install the vanilla stable version of Hadoop, 2.4.1, manually so that you have control over the daemons. Use this link for the steps: link
  • Assume the following memory split inside the VM:

    • Reserved memory for the guest OS: 1 GB minimum
    • Memory for the Hadoop single-node daemons: 2 GB
    • The remaining 1 GB can be used by Spark for learning
  • Then install Spark in standalone mode on a single node (which is just your VM).
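The memory split above can be checked with a few lines of arithmetic. This is only a rough sketch: the function name `plan_memory` and its parameter names are made up for illustration, and the figures are the ones suggested in this answer (6 GB host, 2 GB for Windows, 1 GB for the guest OS, 2 GB for Hadoop), not measured requirements.

```python
# Rough memory budget for the setup described above.
# All figures are in GB and come from the split suggested in the answer;
# plan_memory and its parameter names are hypothetical, for illustration only.

HOST_RAM_GB = 6


def plan_memory(host_ram_gb, windows_gb=2, guest_os_gb=1, hadoop_gb=2):
    """Return a dict showing how the host RAM is divided up."""
    vm_gb = host_ram_gb - windows_gb              # RAM assigned to the Ubuntu VM
    spark_gb = vm_gb - guest_os_gb - hadoop_gb    # what is left over for Spark
    if spark_gb <= 0:
        raise ValueError("no memory left for Spark with this split")
    return {
        "windows_host": windows_gb,
        "vm_total": vm_gb,
        "guest_os": guest_os_gb,
        "hadoop_daemons": hadoop_gb,
        "spark": spark_gb,
    }


plan = plan_memory(HOST_RAM_GB)
for name, gb in plan.items():
    print(f"{name}: {gb} GB")
```

If you change the numbers (for example, giving Windows 3 GB as in the question), the function shows that nothing is left for Spark, which is why the 4 GB VM is suggested here.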

Hope this helps :)

Netanel Malka
vijay kumar