I'm currently working on a project that uses machine learning to determine whether a network flow is a botnet flow or a benign flow. In the process, I've been using several methods of data analysis, including visualization through self-organizing maps (SOMs). I'm very new to the concept of SOMs, so please let me know if I'm making incorrect assumptions.

I've so far created self-organizing maps for a dataset with 6 dimensions using the SOMPY library: https://github.com/sevamoo/SOMPY

Essentially where I am stuck is labeling concentrations of botnet/benign flows within the map using this library. Finding trends with each dimension isn't very useful unless I can find the relationship between the clusters and types of flows.

So, is there any way of labeling SOMs using SOMPY where I can compare concentrations of flows to clusters in the other maps?

If SOMPY isn't sufficient, what other libraries would you suggest? Preferably Python, since I have more experience in that language.

Synchrypha
    I usually hand-code these; it is not very difficult. The book by Kohonen seems to be the go-to reference. – DrM Jul 12 '18 at 01:06

1 Answer

Do you have labels for your data?

  • With labels: Use the classification ability of the SUSI package, which works like an improved majority vote.
  • Without labels: Look at the U-matrix of your data in the SUSI package, treat its borders as cluster borders, and compare the statistics of the resulting clusters.
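
For the labeled case, the plain majority vote that the first bullet compares against can be sketched in a few lines. This is only an illustration in plain Python, not SUSI's or SOMPY's API: the names `best_matching_unit` and `label_nodes` are hypothetical, the codebook (one weight vector per map node) is assumed to come from whichever library trained the map, and the toy data is made up.

```python
from collections import Counter, defaultdict


def best_matching_unit(sample, codebook):
    """Return the index of the codebook vector closest to `sample`
    (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((s - w) ** 2 for s, w in zip(sample, codebook[i])),
    )


def label_nodes(samples, labels, codebook):
    """Map each hit node to (majority label, share of hits with that label)."""
    hits = defaultdict(Counter)
    for sample, label in zip(samples, labels):
        hits[best_matching_unit(sample, codebook)][label] += 1

    node_labels = {}
    for node, counts in hits.items():
        label, count = counts.most_common(1)[0]
        node_labels[node] = (label, count / sum(counts.values()))
    return node_labels


# Toy example (assumed data): a 2x2 map flattened to 4 nodes, 2 features.
codebook = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
samples = [[0.1, 0.0], [0.0, 0.2], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9], [0.1, 0.9]]
labels = ["benign", "benign", "botnet", "botnet", "benign", "botnet"]

node_labels = label_nodes(samples, labels, codebook)
for node in sorted(node_labels):
    print(node, node_labels[node])
```

Nodes that receive no samples are simply absent from the result. The per-node share can be reshaped to the map grid and drawn next to the component planes, which gives a direct view of where botnet flows concentrate relative to each dimension.
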
felice