I would like to set up an H2O cluster across multiple AWS EC2 instances and use it from RStudio. I am not a computer scientist, so apologies if these questions are trivial.
Based on this tutorial (http://amunategui.github.io/h2o-on-aws/) I successfully installed H2O and RStudio on a single AWS EC2 instance (Linux). However, I would rather create a multi-instance cluster with, let's say, 4 instances with 8 cores each.
According to this document (http://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/deployment/multinode.html), I need a flatfile.txt that lists the IP and port of each EC2 instance. Next, I have to copy this file to each node in the cluster, and then start the cluster via the java command line. This is where I get stuck, and I have the following questions:
- Where do I find the IP and port of each H2O instance?
- How exactly can I copy the resulting file to each node?
- From step 5 onwards I am completely confused: where do I have to enter this line, and where can I find the java command line?
- I don't want to use the H2O Web UI, so how can I access the cluster from RStudio (installed on one of the instances)?
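To show how I currently understand the flatfile step, here is a small Python sketch. The IP addresses are made-up placeholders, and I am assuming every node uses 54321 (H2O's default port) and that the file simply holds one `ip:port` entry per line. Please correct me if this is wrong.

```python
# Placeholder private IPs for the four EC2 instances (hypothetical, not real).
node_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

# Assumption: every node listens on H2O's default port.
H2O_PORT = 54321


def build_flatfile(ips, port):
    """Return the flatfile contents: one 'ip:port' entry per line."""
    return "\n".join(f"{ip}:{port}" for ip in ips) + "\n"


flatfile_text = build_flatfile(node_ips, H2O_PORT)

# Write the file locally; the same file would then be copied to each node.
with open("flatfile.txt", "w") as f:
    f.write(flatfile_text)

print(flatfile_text)
```

As far as I understand the document, this same flatfile.txt then has to be present on all four instances before the cluster is started.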
Thank you so much in advance!