How to label data in SOM using SOMPY library?

Question

I'm working on a machine learning project to determine whether a network flow is botnet or benign traffic. Along the way I've been trying different data analysis methods, including visualization with self-organizing maps (SOMs). I'm very new to SOMs, so please let me know if I'm making incorrect assumptions.

So far I've created self-organizing maps for a 6-dimensional dataset using the SOMPY library: https://github.com/sevamoo/SOMPY

Where I'm stuck is labeling concentrations of botnet and benign flows within the map using this library. Finding trends in each dimension isn't very useful unless I can relate the clusters to the types of flows.

So, is there any way of labeling SOMs using SOMPY where I can compare concentrations of flows to clusters in the other maps?

If SOMPY isn't sufficient, what other libraries would you suggest? Preferably Python, since I have more experience in that language.

I usually hand-code these; it is not very difficult. The book by Kohonen seems to be the go-to reference. - DrM, Jul 12 '18 at 01:06

score 0 · Answer 1 · answered Apr 23 '19 at 02:41

Do you have labels for your data?

With labels: Use the classification ability of the SUSI package, which works like an improved majority vote.
Without labels: Look at the U-matrix of your data in the SUSI package, treat its borders as cluster borders, and compare the statistics of the resulting clusters.
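
If you do have labels and want to see the plain majority-vote idea before reaching for a package, here is a minimal NumPy sketch. It is not SOMPY or SUSI code: `codebook` is a stand-in for whatever node weight matrix your trained map exposes, the function names are made up for illustration, and the toy data is invented. It assumes labels are 0 (benign) and 1 (botnet), and that the data is in the same (possibly normalized) space the map was trained in.

```python
import numpy as np

def best_matching_units(codebook, data):
    """Return, for each sample, the index of the closest codebook vector."""
    # Squared Euclidean distance between every sample and every node.
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def label_fraction_map(codebook, data, labels, mapsize):
    """Per-node fraction of samples labeled 1, plus per-node hit counts.

    Nodes that no sample maps to get NaN as their fraction.
    """
    rows, cols = mapsize
    n_nodes = rows * cols
    bmus = best_matching_units(codebook, data)
    hits = np.bincount(bmus, minlength=n_nodes).astype(float)
    positives = np.bincount(bmus, weights=labels, minlength=n_nodes)
    frac = np.full(n_nodes, np.nan)
    nonempty = hits > 0
    frac[nonempty] = positives[nonempty] / hits[nonempty]
    return frac.reshape(rows, cols), hits.reshape(rows, cols)

# Toy stand-in for a trained 2x2 map with 2 input dimensions (hypothetical values).
mapsize = (2, 2)
codebook = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
data = np.array([[0.1, 0.0], [0.0, 0.1], [0.9, 1.0], [1.0, 0.9], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1, 1])  # 0 = benign, 1 = botnet

frac, hits = label_fraction_map(codebook, data, labels, mapsize)

# Majority-vote label per node; -1 marks nodes with no samples.
majority = np.where(np.isnan(frac), -1, (frac >= 0.5).astype(int))
```

`frac` has the same grid shape as the map, so you can plot it (for example with `matplotlib.pyplot.imshow`) next to the per-dimension maps and see which regions are botnet-heavy. Keeping `hits` around helps you avoid over-reading nodes that only a handful of flows landed on.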
