I'm trying to apply a self-organizing map (SOM) to my dataset. I first pre-processed the dataset, which consists of 25 columns, and it now looks like this: [image]

The data is electricity consumption over two years, and there are 25 houses in the dataset. After pre-processing the data, here is the code I've written so far:

import sys
sys.path.insert(0, '../')
%load_ext autoreload

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pylab import plot,axis,show,pcolor,colorbar,bone
from matplotlib.patches import Patch
%matplotlib inline

from minisom import MiniSom
%autoreload 2
X=data1.values
som =MiniSom(3, 3, X.shape[1], sigma=1.5, learning_rate=0.7)
som.random_weights_init(X)
som.train_batch(data=X ,num_iteration=1000,verbose=True)
from pylab import plot,axis,show,pcolor,colorbar,bone
bone()
pcolor(som.distance_map().T) # distance map as background
colorbar()
markers = ['o','s','D']
colors = ['r','g','b']
for cnt,xx in enumerate(X):
 w = som.winner(xx) 
 plot(w[0]+.5,w[1]+.5,markers[cnt],markerfacecolor='None',
   markeredgecolor=colors[cnt],markersize=12,markeredgewidth=2)
axis([0,som.weights.shape[0],0,som.weights.shape[1]])
show() 

When I run the code, I get the following error:

IndexError                                Traceback (most recent call last)
<ipython-input-21-c647ad8d8f9d> in <module>
      9  w = som.winner(xx) # getting the winner
     10  # palce a marker on the winning position for the sample xx
---> 11  plot(w[0]+.5,w[1]+.5,markers[cnt],markerfacecolor='None',
     12    markeredgecolor=colors[cnt],markersize=12,markeredgewidth=2)
     13 axis([0,som.weights.shape[0],0,som.weights.shape[1]])

IndexError: list index out of range

It seems to plot only the three markers I assigned, without cycling over the rest of the dataset. I would appreciate any tips or a solution to this issue.
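
The same error can be reproduced without MiniSom, which suggests the problem lies in how `markers` and `colors` are indexed rather than in the SOM itself. A minimal sketch, assuming only that the dataset has more than three rows (the `rows` list is a made-up stand-in for `X`):

```python
markers = ['o', 's', 'D']
rows = ['row0', 'row1', 'row2', 'row3']  # stand-in for X; any length > 3

failed_at = None
for cnt, _ in enumerate(rows):
    try:
        markers[cnt]          # cnt is the sample index, not a cluster id
    except IndexError:
        failed_at = cnt       # first sample index with no matching marker
        break

print(failed_at)
```

The loop fails at the first sample index that has no matching entry in the three-element list, which is consistent with the traceback pointing at the `plot(...)` line.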

Aghyad Skaif
  • `markers[cnt]` and `colors[cnt]` are out of range when cnt is 3 or larger. You might want to use `markers[cnt%3]` and `colors[cnt%3]`. – JohanC Jul 17 '20 at 11:59
  • Thanks for your answer. What I actually want to do is cluster the data into 3 clusters, where every input vector of the dataset belongs to one of these three clusters, which means each input vector gets one of the three markers. Do you have a solution for that? – Aghyad Skaif Jul 17 '20 at 12:38
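
One possible way to get what the follow-up comment asks for is to index `markers` and `colors` by the cluster each sample falls into, rather than by the sample's position in `X`. The sketch below assumes a 1x3 map, so there are exactly three nodes and therefore three clusters. To stay runnable without MiniSom it uses a small pure-Python stand-in for `som.winner`; the helper names `winner` and `cluster_of`, the toy `weights`, and the toy `samples` are all hypothetical and not part of MiniSom.

```python
# Sketch: choose the marker by cluster (winning node), not by sample index.
# `winner` is a hypothetical stand-in for som.winner(x); it returns the grid
# coordinates of the node whose weight vector is closest to x.

def winner(weights, x):
    """Return (i, j) of the node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(w, x))
            if dist < best_dist:
                best, best_dist = (i, j), dist
    return best


def cluster_of(w, n_cols):
    """Flatten grid coordinates (i, j) into a single cluster id (row-major)."""
    return w[0] * n_cols + w[1]


# Toy 1x3 map: three nodes, so three clusters (assumed values).
weights = [[[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]]]
n_cols = len(weights[0])

markers = ['o', 's', 'D']
colors = ['r', 'g', 'b']

# Five toy samples: more samples than markers, which is fine here.
samples = [[0.1, 0.2], [4.8, 5.1], [9.7, 10.2], [0.3, -0.1], [5.2, 4.9]]

styles = []
for x in samples:
    k = cluster_of(winner(weights, x), n_cols)  # always in range(3)
    styles.append((markers[k], colors[k]))

print(styles)
```

With MiniSom itself, the same idea would presumably be to build the map as `MiniSom(1, 3, X.shape[1], ...)` and use `k = som.winner(xx)[1]` inside the loop, then `markers[k]` and `colors[k]`. On the original 3x3 map there are nine nodes, so nine markers would be needed, or the nodes would have to be grouped into three clusters in a separate step.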
