I have a data set consisting of 6-dimensional data points. I want to produce a self-organizing map (SOM) for this data to see how it is clustered and how many distinct clusters it contains. My dataset is UNLABELED, and all the examples I have come across use labelled data (e.g. the iris dataset). I have tried several Python packages (minisom, sompy, susi) to implement a SOM, but I am unable to visualize and interpret the results.
I would appreciate help with this, and in particular a link to good work on SOM-based clustering of data with more than 3 dimensions, with proper evaluation of the results.
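To make the goal concrete, here is a minimal sketch of the pipeline in plain NumPy. It is not the minisom, sompy, or susi implementation; the function names (`train_som`, `bmu_index`, `u_matrix`), the 5x5 grid, and the linear decay schedules are assumptions chosen for illustration. It trains a small map on unlabeled, already standardized data, assigns each point to its best-matching unit (BMU), and computes a U-matrix from the node weights.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rectangular SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # assumed linear decay
        sigma = sigma0 * (1.0 - frac) + 1e-3    # assumed linear decay
        x = data[rng.integers(len(data))]       # one random sample per step
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood around the BMU, measured on the grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)
    return weights

def bmu_index(weights, data):
    """Return the flat node index of the BMU for every row of data."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return np.argmin(d, axis=1)

def u_matrix(weights):
    """Mean weight-space distance from each node to its 4-connected neighbours."""
    rows, cols, _ = weights.shape
    um = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            neigh = [
                (i + di, j + dj)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= i + di < rows and 0 <= j + dj < cols
            ]
            um[i, j] = np.mean(
                [np.linalg.norm(weights[i, j] - weights[a, b]) for a, b in neigh]
            )
    return um

# Hypothetical stand-in for the real data: two synthetic 6-D blobs, no labels used.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-5, 0.5, (100, 6)), rng.normal(5, 0.5, (100, 6))])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standard scaling per attribute

W = train_som(X)
node_of_point = bmu_index(W, X)   # SOM node id per data point
U = u_matrix(W)                   # high values mark borders between regions
```

The node id per point is only a fine-grained assignment; a common next step is to group neighbouring nodes (for example along the low-valued regions of the U-matrix) so that several nodes form one cluster.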
MORE INFO:
Thanks. I was able to understand the U-matrix. However, I am still struggling to cluster similar data points.
This is a sample of the dataset:
A  B         C       D          E           F
1  0.000613  150386  20.279685  39400220.0  0.672270
1  0.000649  154428  21.069894  8444300.0   0.466464
1  0.000276  154017  20.890017  12361590.0  0.399357
1  0.000186  68675   20.419599  13973180.0  0.430975
1  0.000177  60795   23.276564  5686630.0   0.372155
This is the result of the SOM clustering:
A  B             C    D          E        F         Cluster-id
5  1.096415e-07  274  12.599589  4870.0   0.000060  19
5  1.185185e-07  205  12.108413  10000.0  0.000402  19
5  1.131892e-07  221  12.282051  290.0    0.000014  19
5  1.447471e-07  338  12.708078  1750.0   0.000027  19
5  8.218939e-08  244  12.000000  30.0     0.000027  19
... ... ... ... ... ... ... ... ...
5  2.425165e-08  26   12.517500  2020.0   0.000025  19
5  2.926305e-08  51   12.051724  2320.0   0.000012  19
5  2.326685e-08  18   11.724138  290.0    0.000009  19
5  2.465502e-08  18   12.288000  2500.0   0.000018  19
5  5.118597e-08  80   11.776271  2950.0   0.000093  19
In the result above, attributes C and E vary significantly compared to the other attributes, even though all of these rows belong to the same cluster. What is a plausible reason for this, and how can I fix it so that each cluster contains similar data points? (FYI: I applied standard scaling to the dataset to equalize the variance of each attribute.)
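One possible explanation, which I have not confirmed on the full dataset, is that attribute E spans many orders of magnitude (30 up to roughly 39 million in the excerpts above). Standard scaling divides by a standard deviation dominated by the largest values, so values such as 30 and 10000 end up with nearly identical z-scores and look "similar" to the SOM. The sketch below checks this using only the E values copied from the two excerpts; treating those 15 values as a stand-in for the whole column is an assumption, and the log transform is just one candidate remedy, not a verified fix.

```python
import numpy as np

# Attribute E values copied from the two excerpts in this post.
e_sample = np.array([39400220.0, 8444300.0, 12361590.0, 13973180.0, 5686630.0])
e_cluster = np.array([4870.0, 10000.0, 290.0, 1750.0, 30.0,
                      2020.0, 2320.0, 290.0, 2500.0, 2950.0])

# Assumption: these 15 values stand in for the full column.
e_all = np.concatenate([e_sample, e_cluster])

def standardize(x):
    """Zero-mean, unit-variance scaling (what standard scaling does per column)."""
    return (x - x.mean()) / x.std()

# z-scores of the cluster-19 rows, with and without a log transform first.
z_raw = standardize(e_all)[len(e_sample):]
z_log = standardize(np.log10(e_all))[len(e_sample):]

range_raw = z_raw.max() - z_raw.min()
range_log = z_log.max() - z_log.min()

print("z-score range within cluster 19, raw E:  ", range_raw)
print("z-score range within cluster 19, log10 E:", range_log)
```

On these values the raw range is on the order of 0.001 standard deviations, while the log-transformed range is larger than one standard deviation. If the same holds on the full data, applying a log (or similar) transform to C and E before standard scaling may let the SOM separate those rows.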