Questions tagged [flinkml]

FlinkML is the machine learning library for the Apache Flink distributed streaming engine.

FlinkML is the Machine Learning (ML) library for Flink. It is a new effort in the Flink community, with a growing list of algorithms and contributors. FlinkML aims to provide scalable ML algorithms, an intuitive API, and tools that help minimize glue code in end-to-end ML systems.

Getting Started

To jump right in, first set up a Flink program. Then add the FlinkML dependency to your project's pom.xml.

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-ml</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>

Now you can start solving your analysis task. The following code snippet shows how easy it is to train a multiple linear regression model.

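import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.Vector
import org.apache.flink.ml.regression.MultipleLinearRegression

// LabeledVector is a feature vector with a label (class or real value)
val trainingData: DataSet[LabeledVector] = ...
val testingData: DataSet[Vector] = ...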

val mlr = MultipleLinearRegression()
  .setStepsize(1.0)
  .setIterations(100)
  .setConvergenceThreshold(0.001)

// Train the model; fit also accepts an optional ParameterMap as a second argument
mlr.fit(trainingData)

// The fitted model can now be used to make predictions.
// The result type is left to inference because it can differ between FlinkML versions.
val predictions = mlr.predict(testingData)
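
The snippet above leaves the input data sets elided. As a minimal sketch of how the pieces might fit together, the following program builds tiny in-memory data sets and runs the same fit and predict steps. The object name MlrSketch, the toy values, and the step size are made up for illustration, and the sketch assumes a Flink release in which print() triggers execution and the flink-scala and flink-clients dependencies are on the classpath.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.{DenseVector, Vector}
import org.apache.flink.ml.regression.MultipleLinearRegression

// Hypothetical object name; not part of FlinkML
object MlrSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Made-up toy data (label, features); purely illustrative
    val trainingData: DataSet[LabeledVector] = env.fromCollection(Seq(
      LabeledVector(2.0, DenseVector(1.0)),
      LabeledVector(4.0, DenseVector(2.0)),
      LabeledVector(6.0, DenseVector(3.0))
    ))

    // Unlabeled feature vectors to predict on
    val testingData: DataSet[Vector] = env.fromCollection(Seq[Vector](
      DenseVector(4.0),
      DenseVector(5.0)
    ))

    // Hyperparameters are untuned placeholders
    val mlr = MultipleLinearRegression()
      .setStepsize(0.1)
      .setIterations(100)
      .setConvergenceThreshold(0.001)

    // Train on the labeled data, then predict for the unlabeled vectors
    mlr.fit(trainingData)
    val predictions = mlr.predict(testingData)

    // Assumption: print() runs the job and writes the results to stdout
    predictions.print()
  }
}
```

No expected output is shown because the fitted weights depend on the optimizer settings and the FlinkML version.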

Learn more in the FlinkML documentation.

34 questions
0 votes, 1 answer

Flink Multiple Linear Regression: does it have Predict?

I have trained a multiple regression model and now I want to use it to predict. Reading the documentation, I understand that the input is a labeled vector and the output is a DataSet of tuples [InputValue, PredictValue], right? I create my labeled…
Borja
0 votes, 1 answer

apache-flink KMeans operation on UnsortedGrouping

I have a Flink DataSet (read from a file) that contains sensor readings from many different sensors. I use Flink's groupBy() method to organize the data as an UnsortedGrouping per sensor. Next, I would like to run the KMeans algorithm on every…
Nils Tijtgat
0 votes, 1 answer

flink MultipleLinearRegression fit take 3 params

I followed the example at https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/batch/libs/ml/multiple_linear_regression.html, but in the example the fit function only needs one param, whereas in my code fit requires three…
0 votes, 2 answers

Error with Flink 0.10.1

With Flink 0.10.1 running locally, I can't connect to the JobManager due to the following error: Association with remote system [akka.tcp://flink@127.0.0.1:49789] has failed, address is now gated for [5000] ms. Reason is: [scala.Option; local class…
J.F.