I have the following code:
%pyspark
from pyspark.ml.linalg import Vectors
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans
from pyspark.ml import Pipeline
(trainingData, testData) = dataFrame.randomSplit([0.7, 0.3])
assembler = VectorAssembler(inputCols=["PetalLength", "PetalWidth", "SepalLength", "SepalWidth"], outputCol="features")
kmeans = KMeans().setK(3).setSeed(101010)
pipeline = Pipeline(stages=[assembler, kmeans])
# Fit on the training split only, so testData stays unseen by the model
modelKMeans = pipeline.fit(trainingData)
And when I run this:
predictions = modelKMeans.transform(testData)
z.show(predictions)
In the prediction column, I would like to see "Iris-setosa" instead of 0, "Iris-versicolor" instead of 1, and "Iris-virginica" instead of 2. Is this possible?
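
For illustration, here is a minimal sketch of the kind of relabeling I have in mind. It adds a string column next to prediction by chaining when conditions. The names CLUSTER_TO_SPECIES, with_species_names, and predictedSpecies are hypothetical and not part of the code above. The mapping itself is also an assumption: K-means assigns cluster indices arbitrarily, so which index corresponds to which species would have to be checked against the true labels after each fit.

```python
# Hypothetical mapping; which index matches which species is an assumption
# and has to be verified against the real labels after each fit.
CLUSTER_TO_SPECIES = {
    0: "Iris-setosa",
    1: "Iris-versicolor",
    2: "Iris-virginica",
}


def with_species_names(predictions, mapping=CLUSTER_TO_SPECIES,
                       cluster_col="prediction", out_col="predictedSpecies"):
    """Return `predictions` with an extra string column named `out_col`."""
    # Imported inside the function so the mapping can be defined without Spark loaded.
    from pyspark.sql import functions as F

    label = None
    for index, name in mapping.items():
        condition = F.col(cluster_col) == index
        # Start the CASE WHEN chain on the first entry, then extend it.
        label = F.when(condition, name) if label is None else label.when(condition, name)
    # Rows whose cluster index is not in the mapping get null in `out_col`.
    return predictions.withColumn(out_col, label)
```

In the Zeppelin paragraph this would be used as z.show(with_species_names(predictions)). If the original data has a true label column (assumed here to be called Species), something like predictions.groupBy("Species", "prediction").count() could help decide which index belongs to which name.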