Which machine learning clustering algorithm is best suited to clustering one-dimensional numerical features (scalar values)? Is it Birch, spectral clustering, k-means, DBSCAN... or something else?
2 Answers
All of these methods are better suited to multivariate data. Except for k-means, which historically was used on one-dimensional data, they were all designed with the multivariate problem in mind, and none of them is well optimized for the particular case of one-dimensional data.
For one-dimensional data, use kernel density estimation. KDE is a nice technique in 1D and has strong statistical support, whereas it becomes hard to use for clustering in multiple dimensions. In 1D, the local minima of the estimated density are natural places to split the data into clusters.
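As a rough illustration of the KDE idea, here is a minimal sketch using only the Python standard library. The function names (`gaussian_kde`, `kde_cluster_1d`), the fixed Gaussian kernel, the hand-picked bandwidth, and the grid-based search for minima are all assumptions made for the example, not part of any particular library. In practice you would probably use a library KDE and a principled bandwidth choice.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function that estimates the density at x (Gaussian kernel)."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
    return density

def kde_cluster_1d(data, bandwidth, grid_size=512):
    """Cluster scalars by cutting at local minima of the density estimate.

    Hypothetical helper: evaluates the density on an even grid and treats
    every interior local minimum as a boundary between two clusters.
    """
    lo, hi = min(data), max(data)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    density = gaussian_kde(data, bandwidth)
    dens = [density(x) for x in grid]
    # A grid point is a cut if the density drops into it and does not keep
    # dropping afterwards; "<=" on the right catches flat zero-density gaps.
    cuts = [grid[i] for i in range(1, grid_size - 1)
            if dens[i] < dens[i - 1] and dens[i] <= dens[i + 1]]
    # The label of a value is the number of cut points lying below it.
    labels = [sum(x > c for c in cuts) for x in data]
    return cuts, labels

data = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 9.0, 9.1, 8.8]
cuts, labels = kde_cluster_1d(data, bandwidth=0.5)
print(cuts)    # two cut points, one in each gap between the groups
print(labels)
```

The bandwidth controls how many clusters appear: a smaller value produces more minima and therefore more clusters, a larger value merges neighbouring groups.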

Take a look at the k-means clustering algorithm. It works well for clustering one-dimensional feature vectors. However, k-means does not cope well with outliers in the training dataset; in that case, consider a more outlier-robust algorithm.
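To make the outlier caveat concrete, here is a small sketch of Lloyd's k-means on scalars, written with the standard library only. The function name `kmeans_1d` and the deterministic, quantile-style initialization are assumptions chosen to keep the example reproducible; library implementations typically use randomized initialization instead.

```python
def kmeans_1d(data, k, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    values = sorted(data)
    n = len(values)
    # Assumed initialization: pick k starting centroids spread over the sorted data.
    centroids = [values[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def assign(cs):
        # Each value goes to the index of its nearest centroid.
        return [min(range(k), key=lambda j: abs(x - cs[j])) for x in data]

    for _ in range(max_iter):
        labels = assign(centroids)
        new_centroids = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(centroids)

clean = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
print(kmeans_1d(clean, k=2)[1])            # the two groups are separated
print(kmeans_1d(clean + [100.0], k=2)[1])  # the outlier takes a cluster for itself
```

With the single outlier added, one centroid is pulled toward 100 and the two genuine groups collapse into one cluster, which is the failure mode described above.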
Before implementing a machine learning algorithm (classification, clustering, etc.) for your dataset and problem, I'd suggest using the Weka toolkit to check which algorithm fits best. Weka is a large collection of machine learning and data mining algorithms that can easily be applied to a given problem. Once you have identified which algorithm works best, you can modify it or write your own implementation. Tweaking it may improve accuracy further. You can download Weka from here.
