Clustering of Variables in Python

Question

I have hundreds of binary variables (values 1 and 0), and I want to see how these variables fall into different clusters. I don't see any Python methods for this, but I can see one in R: http://arxiv.org/pdf/1112.0295.pdf

For example, I have data with variables (features) a1, a2, a3, a4, ..., a100, each of which is binary. Instead of clustering the observations, I want to cluster a1, a2, ..., a100 and see which cluster a1 falls into, which cluster a2 falls into, and so on.

Does anyone know of a similar package or method in Python? I tried to set up the R interface in Anaconda so that I could use the R methods, but the interface is not working.

Python 3.4.3 |Anaconda 2.3.0 (64-bit)|

score 3 · Answer 1 · answered Nov 11 '15 at 00:03

First transpose your data matrix.

Then cluster features instead of instances!
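A minimal sketch of the transpose-then-cluster idea, assuming scikit-learn is installed. The data here is synthetic (six made-up binary variables built from two hidden patterns), and the choice of Ward agglomerative clustering with two clusters is only an illustration, not a recommendation for any particular dataset:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)


def noisy_copy(column, flip_prob=0.05):
    """Return a copy of a binary column with a small fraction of entries flipped."""
    flips = rng.random(column.shape) < flip_prob
    return np.where(flips, 1 - column, column)


# Synthetic data: 200 observations x 6 binary variables.
# a1..a3 follow one hidden pattern, a4..a6 follow another (hypothetical setup).
base_a = rng.integers(0, 2, size=200)
base_b = rng.integers(0, 2, size=200)
X = np.column_stack(
    [noisy_copy(base_a) for _ in range(3)] + [noisy_copy(base_b) for _ in range(3)]
)
names = [f"a{i}" for i in range(1, 7)]

# Transpose so that each row is a variable, then cluster the rows as usual.
model = AgglomerativeClustering(n_clusters=2)
labels = model.fit_predict(X.T)

# Map each variable name to the cluster it was assigned to.
variable_clusters = dict(zip(names, labels.tolist()))
print(variable_clusters)
```

For 0/1 data the squared Euclidean distance between two variables equals the number of observations on which they disagree, so the default Euclidean/Ward setting is at least a reasonable starting point. Any other scikit-learn clusterer can be called on `X.T` in the same way.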

Has QUIT--Anony-Mousse

score 0 · Answer 2 · answered Nov 10 '15 at 17:50

The package scikit-learn has exactly what you are looking for.

It contains many clustering algorithms, such as K-means, affinity propagation, mean-shift, spectral clustering, Ward hierarchical clustering, agglomerative clustering, DBSCAN, Gaussian mixtures, and more.

Niki van Stein

All those methods in scikit-learn are applied to observations, not to variables. I have updated the original question to make it clearer. – Sanoj Nov 10 '15 at 18:02
@Sanoj, isn't what you want to do similar to PCA? (http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html) – Niki van Stein Nov 10 '15 at 20:48
You can also look here: http://stats.stackexchange.com/questions/138325/clustering-a-correlation-matrix. The answer there provides code for covariance clustering. – Niki van Stein Nov 10 '15 at 20:50
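
The correlation-matrix approach mentioned in the last comment could look roughly like the sketch below. It is not the code from the linked answer: the synthetic data, the distance 1 - |correlation|, average linkage, and the 0.5 cut-off are all assumptions chosen for illustration, and it relies on NumPy and SciPy being available.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)


def noisy_copy(column, flip_prob=0.05):
    """Return a copy of a binary column with a small fraction of entries flipped."""
    flips = rng.random(column.shape) < flip_prob
    return np.where(flips, 1 - column, column)


# Synthetic data: 300 observations x 6 binary variables driven by two hidden factors.
factor_a = rng.integers(0, 2, size=300)
factor_b = rng.integers(0, 2, size=300)
X = np.column_stack(
    [noisy_copy(factor_a) for _ in range(3)] + [noisy_copy(factor_b) for _ in range(3)]
)
names = [f"a{i}" for i in range(1, 7)]

# Correlation between variables (columns), turned into a distance:
# strongly correlated variables (positively or negatively) end up close together.
corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)
np.fill_diagonal(dist, 0.0)

# Hierarchical clustering on the condensed distance matrix.
condensed = squareform(dist, checks=False)
Z = linkage(condensed, method="average")

# Cut the tree at an assumed distance threshold of 0.5.
labels = fcluster(Z, t=0.5, criterion="distance")
variable_clusters = dict(zip(names, labels.tolist()))
print(variable_clusters)
```

A variable that is constant across all observations has undefined correlation, so such columns would need to be dropped before this step.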
