Questions tagged [sparkling-water]

Sparkling Water integrates H2O's fast scalable machine learning engine with Spark. It provides:

Utilities to publish Spark data structures (RDDs, DataFrames) as H2O's frames and vice versa
A DSL to use Spark data structures as input for H2O's algorithms
Basic building blocks to create ML applications utilizing Spark and H2O APIs
A Python interface enabling use of Sparkling Water directly from PySpark

Getting Started

Select the right version

Sparkling Water is developed in multiple parallel branches. Each branch corresponds to a Spark major release; for example, for Spark 1.6, use Sparkling Water version 1.6.
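The version rule above can be sketched as a small helper. This is only an illustration: the function name `sparkling_water_line` is hypothetical, and it assumes the matching Sparkling Water release line shares the `major.minor` prefix of the Spark version, as in the Spark 1.6 example. Check the official release notes for the exact version to install.

```python
import re


def sparkling_water_line(spark_version: str) -> str:
    """Return the 'major.minor' release line for a Spark version string.

    Hypothetical helper: assumes the Sparkling Water release line uses the
    same major.minor prefix as the Spark release it targets.
    """
    match = re.match(r"^(\d+)\.(\d+)", spark_version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Spark version: {spark_version!r}")
    major, minor = match.groups()
    return f"{major}.{minor}"


if __name__ == "__main__":
    # Spark versions mentioned in the questions listed on this page.
    for version in ("1.6.3", "2.0.1", "2.1.0"):
        line = sparkling_water_line(version)
        print(f"Spark {version}: use a Sparkling Water {line}.x release")
```

In PySpark, the input string could come from `spark.version` on an existing session.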

Recommended reference sources:

Sparkling Water installation guide
Sparkling Water documentation
Sparkling Water GitHub documentation

129 questions

0 answers

Running pysparkling-water using Livy spark failed

I have been able to run the ChicagoCrimeDemo.py script using spark-submit successfully (spark-submit --master=yarn-client --py-files /opt/sparkling-water-1.6.10/py/build/dist/h2o_pysparkling_1.6-1.6.10-py2.7.egg…

h2o sparkling-water

asked May 17 '17 at 19:39

Yandy Perez Ramos

1 answer

How to interpret results from Sparkling Water's GBM algorithm on classification task

I'm new to Sparkling Water and machine learning. I've built a GBM model with two datasets divided manually into train and test. The task is classification with all numeric attributes (the response column is converted to enum type). The code is in Scala. val…

scala h2o gbm sparkling-water

asked Apr 24 '17 at 14:54

velaciela

0 answers

RSparkling: SqlException while accessing metastore_db of Hive from RSparkling

I am running RSparkling on a local system with Apache Spark 2.0.1. When I call h2o_context(sc), I get a permission exception for /tmp/hive, which I set using winutils.exe. After that, when I try to run the following command: mtcars_tbl <- copy_to(sc,…

apache-spark apache-spark-sql sparkling-water

asked Mar 13 '17 at 11:20

Mansoor

1 answer

Can I use only some of the columns that were used to create a GBM model and still predict in supervised learning?

In a GBM model, I have nearly 150 columns used to train and create the model. I have a case where, for some records, I won't be getting all the columns. In that case, will the model work? I don't want to set the values to 0.

apache-spark-mllib h2o supervised-learning gbm sparkling-water

asked Mar 11 '17 at 18:33

DINESHKUMAR MURUGAN

2 answers

H2O package not found: Scala Sparkling Water

I am trying to run Sparkling Water on my local instance of Spark 2.1.0. I followed the H2O documentation for Sparkling Water, but when I try to execute sparkling-shell.cmd I get the following error: The filename, directory name, or volume label…

scala apache-spark h2o sparkling-water

asked Mar 08 '17 at 10:39

Mansoor

3 answers

Spark Shell - The filename, directory name, or volume label syntax is incorrect

I am getting an error while running spark-shell.cmd with the following parameters: "C:\SoftwareLibraries\spark\spark-2.0.1\bin\spark-shell.cmd" --jars…

apache-spark apache-spark-2.0 sparkling-water

asked Feb 22 '17 at 08:07

Mansoor

1 answer

sparklyr + rsparkling: Error while connecting to a cluster

For some time I've been using the sparklyr package to connect to the company's Hadoop cluster using the…

r hadoop apache-spark sparklyr sparkling-water

asked Feb 14 '17 at 13:32

Maju116

2 answers

Create a job that goes through H2O Flow automatically

I have created a flow to predict something with the distributed random forest model, and now I want to predict every few days without using the Flow GUI. Is there a way to automate an H2O Flow or to translate the entire script into Java/Python…

apache-spark h2o sparkling-water

asked Jan 18 '17 at 17:40

BumbleBeeBro

1 answer

Understanding Sparkling Water

I am new to Sparkling Water and want to ask some quick questions: Does Sparkling Water support all the algorithms that both Spark MLlib and H2O provide? Does Sparkling Water itself provide algorithms that Spark MLlib and H2O don't support? If I…

h2o sparkling-water

asked Jan 06 '17 at 07:47

Tom

0 answers

How to run Sparkling Water example with spark in local mode

I am trying to run the Sparkling Water deep learning demo in IntelliJ IDEA. The code link is: https://github.com/h2oai/sparkling-water/blob/RELEASE-2.0.3/examples/src/main/scala/org/apache/spark/examples/h2o/DeepLearningDemo.scala It fails to start, the…

h2o sparkling-water

asked Jan 06 '17 at 03:26

Tom

1 answer

Sparkling Water: out of memory when converting Spark DataFrame to H2O DataFrame

I am trying to convert a Spark DataFrame to an H2O DataFrame. For the Spark setup, I am using .setMaster("local[1]") .set("spark.driver.memory", "4g") .set("spark.executor.memory", "4g"), and I tried H2O 2.0.2 and H2O 1.6.4. I got the same error with both…

apache-spark apache-spark-sql h2o sparkling-water

asked Dec 16 '16 at 04:43

lserlohn

2 answers

h2o sparkling water save frame to disk

I am trying to import a frame by creating an H2O frame from a Spark Parquet file. The file is 2 GB, has about 12M rows, and contains sparse vectors with 12k columns. It is not that big in Parquet format, but the import takes forever. In H2O it is actually reported…

h2o sparkling-water

asked Dec 12 '16 at 15:50

samst

1 answer

Unable to find class: org.apache.spark.h2o.package$StringHolder

I am trying the simple droplet program (https://github.com/h2oai/sparkling-water), but I am unable to run it successfully using spark-submit. I used Sparkling Water 1.6.4, as used in the sample code. spark-submit --jars…

scala apache-spark h2o sparkling-water

asked Dec 06 '16 at 23:18

lserlohn

1 answer

the purpose of creating an h2o model

In the demo code https://github.com/h2oai/sparkling-water/blob/master/py/examples/notebooks/TensorFlowDeepLearning.ipynb I can more or less make out what the code is doing. My question is what is the advantage in creating the h2o model at the…

tensorflow sparkling-water

asked Nov 03 '16 at 00:15

bhomass

2 answers

Sparkling water: Can't make use of the support of spark ml pipelines

According to this blog by the Sparkling Water team, you can now use Spark ML pipeline components to build a DL model in the latest versions. I tried adding the latest versions in my build.sbt "org.apache.spark" % "spark-mllib_2.10" %…

scala apache-spark apache-spark-mllib h2o sparkling-water

asked Oct 03 '16 at 18:55

void
