Questions tagged [sparkling-water]

Sparkling Water integrates H2O's fast scalable machine learning engine with Spark. It provides:

Utilities to publish Spark data structures (RDDs, DataFrames) as H2O's frames and vice versa
DSL to use Spark data structures as input for H2O's algorithms
Basic building blocks to create ML applications utilizing Spark and H2O APIs
Python interface enabling use of Sparkling Water directly from PySpark

Getting Started

Select the right version

Sparkling Water is developed in multiple parallel branches. Each branch corresponds to a Spark major release; for example, for Spark 1.6 use the Sparkling Water 1.6 branch.
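
As a rough illustration of the version rule above, the sketch below derives the Sparkling Water release line from a Spark version string by keeping only its major.minor prefix. The function name `sparkling_water_line` is hypothetical and not part of any Sparkling Water API; the sketch assumes only the mapping stated above (Spark 1.6 pairs with Sparkling Water 1.6) and does not check whether a matching branch actually exists.

```python
import re


def sparkling_water_line(spark_version: str) -> str:
    """Return the 'major.minor' prefix of a Spark version string.

    Hypothetical helper: it assumes the Sparkling Water release line
    carries the same major.minor number as the Spark release it targets.
    """
    match = re.match(r"^(\d+)\.(\d+)", spark_version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Spark version: {spark_version!r}")
    return f"{match.group(1)}.{match.group(2)}"


# Spark 1.6.3 would pair with the Sparkling Water 1.6 line.
print(sparkling_water_line("1.6.3"))
```

In a running PySpark session, the input string could come from `spark.version`.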

Recommended reference sources:

Sparkling Water installation guide
Sparkling Water documentation
Sparkling Water GitHub documentation

129 questions

0 answers

Why does H2O give different predictions on a Spark cluster than in Spark local mode?

H2O in Spark cluster mode gives different predictions from Spark local mode. H2O in Spark local mode gives better results than on the Spark cluster. Why is this happening? Can you help me? Tell me whether it's H2O behaviour. Two data sets are being used. One for…

h2o sparkling-water

asked Mar 30 '18 at 15:11

poojanavin

1 answer

How to export an H2O model as MOJO from Sparkling Water in Scala, to be loaded by EasyPredictModelWrapper

My goal is to export an H2O model trained on Spark with Scala (using sparkling-water), such that I can import it in an application without Spark. Thus: using Scala (the documentation only shows examples for R and Python) export a model which is…

scala h2o sparkling-water

asked Mar 27 '18 at 14:30

gerben

0 answers

NullPointerException PySparkling H2OFrame to Spark DataFrame

pysparkling 2.1
I run the following code:
hc = H2OContext.getOrCreate(spark)
h2o_frame = h2o.import_file('hdfs:path/to/my/file.csv')
spark_frame = hc.as_spark_frame(h2o_frame)
and it works just fine, just like in the documentation. But then when I…

python apache-spark pyspark h2o sparkling-water

asked Feb 23 '18 at 15:46

Tiberiu

1 answer

Is there any performance difference for ML Training between H2O Multi-node cluster and H2O Spark Cluster based on Sparkling Water?

I am curious about the cluster configuration environment in terms of the ML Training performance of H2O. If there are three nodes, is there a performance difference between configuring a generic H2O Multi-node Cluster and configuring an H2O Spark…

cluster-computing h2o sparkling-water

asked Feb 09 '18 at 01:12

김태훈

1 answer

Create Sparkling Water Cloud in Databricks using Python Notebook

I am trying to launch a Sparkling Water cloud within Spark using Databricks. I've attached the H2O library (3.16.0.2), PySparkling (pysparkling 0.4.6), and the Sparkling Water jar (sparkling-water-assembly_2.11-2.1.10-all.jar) to the cluster I'm…

pyspark h2o databricks sparkling-water

asked Dec 22 '17 at 13:38

Frank B.

1 answer

What are the benefits of Sparkling Water over the H2O machine learning library?

I've understood that Sparkling Water is H2O executed in a Spark environment, so it can use the Spark engine (and all Spark distributed structures) to distribute computing. But in terms of performance, what are the benefits, since H2O is already a…

apache-spark machine-learning h2o sparkling-water

asked Dec 19 '17 at 19:48

xcsob

1 answer

Sparkling Water fails to create h2oContext in simple spark project

I am setting up Sparkling Water for the first time on a standalone cluster running Spark 2.2. I have run Sparkling Water on such a cluster before via R (using rsparkling + sparklyr + h2o), but am having issues setting this up as a Spark application…

apache-spark h2o sparkling-water

asked Nov 17 '17 at 10:30

renegademonkey

0 answers

Cannot rename Spark table column names in sparklyr/rsparkling

Getting knee-deep with sparklyr/rsparkling, I have some Spark tables with annoying column names and I would like to rename them. But I cannot seem to do it. library(sparklyr) library(rsparkling) library(dplyr) library(DBI) sc <-…

r sparklyr sparkling-water

asked Nov 02 '17 at 04:41

Chris

2 answers

LDAP authentication using sparkling-water

We need to authenticate users using LDAP in sparkling-water. We tried configuring this using Sparkling Water 1.6.13 and H2O 3.14.0.2. Below is the configuration: *ldaploginmodule { org.eclipse.jetty.plus.jaas.spi.LdapLoginModule required …

ldap h2o sparkling-water

asked Oct 25 '17 at 05:59

Satish Agrawal

1 answer

How to change port of web UI with pysparkling

I'm just trying to get pysparkling working, but with a different port for the web UI. I've looked in the help files and they seem to reference old versions of Sparkling Water. Currently I am running: from pysparkling import * hc =…

h2o sparkling-water

asked Oct 18 '17 at 16:28

chib

0 answers

Sparkling Water local mode cluster error

I'm trying to extend the hamOrSpam example (https://github.com/h2oai/sparkling-water/blob/master/examples/scripts/hamOrSpam.script.scala) to make parallel predictions for a large dataset using Spark's parallel computation power (during the inference…

scala apache-spark h2o sparkling-water

asked Jul 01 '17 at 08:46

siv

1 answer

H2O error when calling as.factor on H2O data frame

When I call the following reproducible code: install.packages("h2o", type = "source", repos = …

r h2o sparklyr sparkling-water

asked Jun 19 '17 at 22:54

Levi Brackman

1 answer

Why does H2O integrate TensorFlow via Spark instead of directly?

I really like H2O, especially because you can deploy the built models easily into any Java / JVM application... This is also my goal for TensorFlow: build models and then run them in Java applications. H2O uses Spark (Sparkling Water) "in the middle"…

java apache-spark tensorflow h2o sparkling-water

asked Jun 08 '17 at 11:08

Kai Wähner

0 answers

Making H2O grid search deterministic

In order to run the H2O RandomDiscreteValueWalker[DRFParameters] with deterministic results, is it sufficient to set the seed on the DRFParameters and the RandomDiscreteValueSearchCriteria? I get non-deterministic results even when I have the seed…

machine-learning h2o sparkling-water

asked Jun 05 '17 at 20:58

x89a10

0 answers

Curl connection in H2O 3.11.4.8 using Apache Hadoop 2.7.3

I have installed HDP 2.6 on a computer cluster with only 2 nodes. Each node has a 2-core processor, 8 GB of RAM, and a 40 GB hard disk. I also installed Apache Hadoop 2.7.3. Because of that, I can run H2O 3.11.4.8 using YARN. But,…

python r deep-learning h2o sparkling-water

asked May 31 '17 at 03:26

Rendi 7936
