I'm using DBSCAN from sklearn and HDBSCAN to cluster some documents:
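from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(stop_words=mystopwords)
X = vectorizer.fit_transform(y)  # y holds the raw document strings
dbscan = DBSCAN(eps=0.75, min_samples=9)
clusters = dbscan.fit_predict(X)  # label -1 marks noise points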
Now, how can I get the top terms in each cluster? With k-means we do something like this:
terms = vectorizer.get_feature_names_out()
order_centroids = kmeans_model.cluster_centers_.argsort()[:, ::-1]
for i in range(true_k):
    print("Cluster %d:" % i)
    for ind in order_centroids[i, :10]:  # top 10 terms per cluster
        print(' %s' % terms[ind])
But DBSCAN and HDBSCAN don't produce centroids. How can we find the top terms in each of their clusters?