I am using the R programming language and learning about the kohonen package by following this tutorial. The kohonen R package lets the user fit Kohonen networks (also called SOMs, or Self-Organizing Maps), a type of unsupervised machine learning algorithm used in data visualization.
I ran the following code and produced the plots below:
#load libraries
library(kohonen) #fitting SOMs
library(RColorBrewer) #colors, using predefined palettes
#process data
iris_complete <-iris[complete.cases(iris),] #only complete cases... the iris dataset floats around in the sky with diamonds.
iris_unique <- unique(iris_complete) # Remove duplicates
iris.sc = scale(iris_unique[, 1:4])
#run the SOM
iris.grid = somgrid(xdim = 10, ydim=10, topo="hexagonal", toroidal = TRUE)
set.seed(33) #for reproducibility
iris.som <- som(iris.sc, grid=iris.grid, rlen=700, alpha=c(0.05,0.01), keep.data = TRUE)
#make plots (3 different plots)
plot(iris.som, type="count")
plot(iris.som, type="dist.neighbours",
palette.name=grey.colors, shape = "straight")
var <- 1 #define the variable to plot
plot(iris.som,
type = "property",
property = getCodes(iris.som)[,var],
main=colnames(getCodes(iris.som))[var],
palette.name=terrain.colors)
From here, I am trying to modify these plots so that they are easier to read. I would like to add a "label" (a number from 1 to 100) to each circle so that it is easier to identify each circle.
I am not sure if there is a straightforward way to place a number on each corresponding circle. Looking at the som() function in the kohonen package (https://www.rdocumentation.org/packages/kohonen/versions/2.0.19/topics/som), it seems possible to determine which observation belongs to which circle:
#determine which circle each observation belongs to
a = iris.som$unit.classif
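#pull the original data (in kohonen 3.x, $data is a list of matrices; in 2.x use iris.som$data directly)
b = iris.som$data[[1]]
#combine both of them into one frame (one row per observation, so cbind rather than rbind)
c = cbind(unit = a, b)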
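To make the observation-to-circle mapping more concrete, here is a minimal sketch of how the unit assignments could be summarized with base R. It assumes kohonen 3.x (where `somgrid()` accepts `toroidal`) and refits the same SOM so that it runs on its own; the names `unit_of_obs`, `unit_members` and `unit_counts` are mine, not part of the package.

```r
library(kohonen)

# Refit the same SOM as above so this block is self-contained
iris.sc <- scale(unique(iris[complete.cases(iris), ])[, 1:4])
iris.grid <- somgrid(xdim = 10, ydim = 10, topo = "hexagonal", toroidal = TRUE)
set.seed(33)
iris.som <- som(iris.sc, grid = iris.grid, rlen = 700,
                alpha = c(0.05, 0.01), keep.data = TRUE)

# One entry per observation: the index (1 to 100) of its winning unit
unit_of_obs <- iris.som$unit.classif

# Row positions of iris.sc grouped by unit (only occupied units appear)
unit_members <- split(seq_along(unit_of_obs), unit_of_obs)

# Observations per unit, with zeros for empty units
unit_counts <- tabulate(unit_of_obs, nbins = nrow(iris.som$grid$pts))
```

Here `unit_members[["17"]]` would list the rows mapped to unit 17 if that unit is occupied, and `unit_counts` should correspond to what the "count" plot colors by.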
But I am not sure if it is possible to "superimpose" these numbers on to the corresponding circles. Does anyone know if this can be done?
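For reference, this is a sketch of the kind of thing I have in mind, not a confirmed solution. It assumes that the unit coordinates stored in `iris.som$grid$pts` match the plot's coordinate system, and that passing `keepMargins = TRUE` to `plot()` keeps that coordinate system active so base `text()` can draw on top. The name `unit_xy` is just mine.

```r
library(kohonen)

# Refit the same SOM as above so this block is self-contained
iris.sc <- scale(unique(iris[complete.cases(iris), ])[, 1:4])
iris.grid <- somgrid(xdim = 10, ydim = 10, topo = "hexagonal", toroidal = TRUE)
set.seed(33)
iris.som <- som(iris.sc, grid = iris.grid, rlen = 700,
                alpha = c(0.05, 0.01), keep.data = TRUE)

# keepMargins = TRUE is meant to retain the map coordinates after plotting
plot(iris.som, type = "count", keepMargins = TRUE)

# One row per unit, in unit order; columns are the x and y positions
unit_xy <- iris.som$grid$pts

# Write the unit number (1 to 100) at the centre of each circle
text(unit_xy[, 1], unit_xy[, 2], labels = seq_len(nrow(unit_xy)), cex = 0.6)
```

If the assumption about the coordinates holds, the same two `text()` lines should work after the "dist.neighbours" and "property" plots as well.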
Update:
I tried the following code:
iris_unique$ID <- seq_along(iris_unique[,1])
plot(iris.som, type="mapping", bg = rgb(colour4), shape = "straight",
border = "grey", labels = iris_unique[,6])
or:
library(plotly)
plot1 = plot(iris.som, type="mapping", bg = rgb(colour4), shape = "straight",
border = "grey", labels = iris_unique[,6])
plotly_plot = ggplotly(plot1)
But I don't think that this is correct.