Is there any performance difference for ML Training between H2O Multi-node cluster and H2O Spark Cluster based on Sparkling Water?

Question

I would like to know how the cluster configuration affects the ML training performance of H2O.

With three nodes, is there a performance difference between configuring a generic H2O Multi-node Cluster and configuring an H2O Spark Cluster based on Sparkling Water?

From our experiments, we conclude that there is no obvious performance difference between the two.

However, much of the H2O documentation suggests that H2O Sparkling Water is more effective for ML training.

Reference
- H2O Multi-node Cluster: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/starting-h2o.html#flatfile

Welcome to StackOverflow. This is a programming community, and your question is probably going to see more answers on its sister site instead: https://stats.stackexchange.com/ – Graham, Feb 09 '18 at 01:27

score 1 · Answer 1 · answered Feb 09 '18 at 01:58

Your measured observation is correct. There is no difference.
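
A comparison like the one described in the question can be reproduced by timing the same training call against each cluster and comparing the medians. The following is only a minimal sketch: `time_runs`, `relative_gap`, and `stand_in_training` are hypothetical names introduced for illustration, and the stand-in workload is there only so the code runs without a cluster. Replace it with your own training call against each setup.

```python
import statistics
import time


def time_runs(train_fn, repeats=3):
    """Call train_fn `repeats` times and return wall-clock seconds per run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        train_fn()
        timings.append(time.perf_counter() - start)
    return timings


def relative_gap(timings_a, timings_b):
    """Relative difference between the median timings of two setups."""
    med_a = statistics.median(timings_a)
    med_b = statistics.median(timings_b)
    return abs(med_a - med_b) / max(med_a, med_b)


def stand_in_training():
    # Placeholder workload (assumption): swap in the real training call,
    # once connected to the plain H2O cluster and once to Sparkling Water.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    standalone = time_runs(stand_in_training)
    sparkling = time_runs(stand_in_training)
    gap = relative_gap(standalone, sparkling)
    print(f"relative gap between medians: {gap:.1%}")
```

Using the median over several runs, rather than a single measurement, keeps one slow run (for example a cold start) from dominating the comparison.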

TomKraljevic

Thank you for your comment! – 김태훈 Feb 10 '18 at 10:10
