I have gone through the question "ELKI Kmeans clustering Task failed error for high dimensional data", but its solution doesn't help.
This is my first time with ELKI, so please bear with me. I have 45,000 2D data points (obtained after running doc2vec) that contain negative values and are not normalized. The dataset looks something like this:
-4.688612 32.793335
-42.990147 -20.499323
-24.948868 -10.822767
-45.502155 -40.917801
27.979715 -40.012688
1.867812 -9.838544
56.284512 6.756072
I am using the K-means algorithm to get 2 clusters. However, I get the following error:
Task failed
de.lmu.ifi.dbs.elki.data.type.NoSupportedDataTypeException: No data type found satisfying: NumberVector,field AND NumberVector,variable
Available types: DBID DoubleVector,variable,mindim=0,maxdim=1 LabelList
at de.lmu.ifi.dbs.elki.database.AbstractDatabase.getRelation(AbstractDatabase.java:126)
at de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm.run(AbstractAlgorithm.java:81)
at de.lmu.ifi.dbs.elki.workflow.AlgorithmStep.runAlgorithms(AlgorithmStep.java:105)
at de.lmu.ifi.dbs.elki.KDDTask.run(KDDTask.java:112)
at de.lmu.ifi.dbs.elki.application.KDDCLIApplication.run(KDDCLIApplication.java:61)
at [...]
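One thing I notice is that the "Available types" line reports `mindim=0,maxdim=1`, which I read as the parser not finding two columns on every line. Below is a minimal sketch of a check that could be run on the input file to see how many numeric fields each line really has. It assumes whitespace-separated columns. The function name `column_count_histogram` and the sample lines (including the blank line and the comma-separated line) are hypothetical and only there for illustration.

```python
from collections import Counter


def column_count_histogram(lines):
    """Count how many numeric fields each line contains.

    Keys are the number of fields per line; lines with any field that
    does not parse as a float are counted under "non-numeric".
    """
    counts = Counter()
    for line in lines:
        fields = line.split()  # assumption: columns are whitespace-separated
        try:
            for field in fields:
                float(field)
        except ValueError:
            counts["non-numeric"] += 1
            continue
        counts[len(fields)] += 1
    return counts


# Hypothetical sample: two well-formed lines, one blank line,
# and one line that uses a comma instead of whitespace.
sample = [
    "-4.688612 32.793335",
    "-42.990147 -20.499323",
    "",
    "27.979715,-40.012688",
]
hist = column_count_histogram(sample)
print(hist)

# To check a real file (the file name here is a placeholder):
# with open("data.txt") as f:
#     print(column_count_histogram(f))
```

If every line is well formed, the histogram should contain only the key `2`. Any other key would point to blank lines, a header, or a separator that differs from what this check assumes.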
So my question is: does ELKI require the data to be in the range [0,1]? All the examples that I came across had their data within that range.
Or is it that ELKI does not accept negative values?
If it is something else, can someone please guide me through this?
Thank you!