I want to use t-SNE features as input to the DBSCAN clustering algorithm, but the sklearn implementation raises an error for n_components > 3.

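import numpy as np
from sklearn.manifold import TSNE

X = np.array([[0, 0, 0, 2, 0, 0, 2], [0, 1, 1, 53, 0, 0, 2], [1, 0, 1, 12, 0, 0, 2], [1, 1, 1, 75, 0, 0, 2]])
# n_components=5 with the default method='barnes_hut' triggers the error below
X_embedded = TSNE(n_components=5).fit_transform(X)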

Error:

ValueError                                Traceback (most recent call last)
<ipython-input-22-79c671f39a06> in <module>
----> 1 tsne_data = model.fit(clustering_ready_data_encoded)

~/anaconda3/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py in fit(self, X, y)
    902         y : Ignored
    903         """
--> 904         self.fit_transform(X)
    905         return self

~/anaconda3/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py in fit_transform(self, X, y)
    884             Embedding of the training data in low-dimensional space.
    885         """
--> 886         embedding = self._fit(X)
    887         self.embedding_ = embedding
    888         return self.embedding_

~/anaconda3/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py in _fit(self, X, skip_num_points)
    685 
    686         if self.method == 'barnes_hut' and self.n_components > 3:
--> 687             raise ValueError("'n_components' should be inferior to 4 for the "
    688                              "barnes_hut algorithm as it relies on "
    689                              "quad-tree or oct-tree.")

ValueError: 'n_components' should be inferior to 4 for the barnes_hut algorithm as it relies on quad-tree or oct-tree.

I know t-SNE is not recommended as a source of features for clustering algorithms, but I still want to try it.

Chandan Malla

1 Answer

You can set method='exact'. The default method, barnes_hut, only works when n_components < 4 because it relies on a quad-tree or oct-tree, as the error message states.
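A minimal sketch of that suggestion, fed into DBSCAN as the question intends. The random data, the perplexity, and the DBSCAN eps and min_samples values are placeholder assumptions for illustration, not tuned settings. The sketch uses 40 samples because perplexity has to be smaller than the number of samples, which the 4-row array in the question would not satisfy with the default perplexity of 30.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.manifold import TSNE

# Placeholder data (assumption): 40 samples with 7 features each.
rng = np.random.RandomState(0)
X = rng.rand(40, 7)

# method='exact' avoids the quad-tree/oct-tree restriction of barnes_hut,
# so n_components may be 4 or more. perplexity is kept below n_samples.
tsne = TSNE(
    n_components=5,
    method="exact",
    perplexity=10,
    init="random",
    random_state=0,
)
X_embedded = tsne.fit_transform(X)
print(X_embedded.shape)

# Cluster the embedding. eps and min_samples are arbitrary placeholders
# and would need tuning on real data.
labels = DBSCAN(eps=3.0, min_samples=3).fit_predict(X_embedded)
print(labels.shape)
```

Note that the exact method scales quadratically with the number of samples, so it can become slow on large datasets.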