I have code that produces a k-means cluster diagram. Rather than repeat that code for each data set, I want to wrap it in a function.
I envisage it taking three arguments: x, y, and z.
Below is what I have got so far. I'd really welcome any assistance.
I am using Python 3 in a Jupyter Notebook with the pandas, Matplotlib, and scikit-learn packages.
x = chosen correlation (moving average data set - plotted on x-axis)
y = chosen index change (y axis data set)
z = corresponding subset (various dataframes which hold the different x & y combinations)
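import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import scale

def make_cluster(x, y, z):
    # x, y are column names (strings); z is the DataFrame holding those columns
    model = KMeans(n_clusters=6)
    model.fit(scale(z[[x, y]]))
    # Store the fitted labels so the rows can be grouped by cluster
    z = z.assign(cluster=model.labels_)
    z.plot.scatter(x=x, y=y)
    plt.xlabel('Correlation')
    plt.ylabel('Daily Return')
    plt.grid()
    plt.title(str(x) + " Day / " + str(y) + " Daily Performance")
    plt.show()
    groups = z.groupby('cluster')
    fig, ax = plt.subplots()
    for name, group in groups:
        # group[x] uses the column named by x; group.x would look for a column called "x"
        ax.plot(group[x], group[y], marker='o', linestyle='', label=name)
    ax.legend()
    plt.show()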
Examples of the x, y and z arguments are as follows:
# Z Example
UK30 = Raw[['Cor30', 'FTSE100change']]
# X Example
Cor30 = 'Cor30'
# Y Example
FTSE100change = 'FTSE100change'
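To illustrate how column-name strings like these select data, here is a minimal sketch. It assumes a small made-up stand-in for Raw (the values and the extra 'Other' column are hypothetical, not real market data). It shows that bracket indexing uses the value held in a variable, while attribute access uses the literal name after the dot.

```python
import pandas as pd

# Hypothetical stand-in for Raw; the values are made up for illustration.
Raw = pd.DataFrame({
    'Cor30': [0.10, 0.25, 0.40, 0.55],
    'FTSE100change': [0.5, -0.2, 0.1, 0.3],
    'Other': [1, 2, 3, 4],
})

UK30 = Raw[['Cor30', 'FTSE100change']]  # z: the subset DataFrame
x = 'Cor30'                             # x: a column name held in a variable
y = 'FTSE100change'                     # y: likewise

# Bracket indexing uses the value of the variable, so this selects 'Cor30'.
x_values = UK30[x]

# Attribute access uses the literal name after the dot, so UK30.x looks for
# a column called "x", which does not exist in this frame.
try:
    UK30.x
    attribute_lookup_worked = True
except AttributeError:
    attribute_lookup_worked = False

print(x_values.tolist())
print(attribute_lookup_worked)
```

Inside a function that receives x and y as strings, this suggests writing group[x] and group[y] rather than group.x and group.y.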
I am trying to get to the point where I can call make_cluster(x, y, z) and have it produce the clustering diagram for the given arguments.
Whatever is passed in as the arguments should be used wherever the corresponding "x", "y" and "z" appear in the code.
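One way the "same code for several data sets" idea might look is sketched below, with the plotting left out so only the argument passing is shown. The helper name label_clusters, the 'Cor60' column, the synthetic data, and the n_init and random_state settings are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import scale

def label_clusters(x, y, z, n_clusters=6):
    """Return a copy of z with a 'cluster' column (hypothetical helper)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    # Fit on the two chosen columns after standardising them
    labels = model.fit_predict(scale(z[[x, y]]))
    return z.assign(cluster=labels)

# Synthetic stand-in for Raw; 'Cor60' is a hypothetical second correlation column.
rng = np.random.default_rng(0)
Raw = pd.DataFrame({
    'Cor30': rng.uniform(-1, 1, 60),
    'Cor60': rng.uniform(-1, 1, 60),
    'FTSE100change': rng.normal(0, 1, 60),
})

# Each (x, y) pair is passed through the same function instead of copying code.
combos = [('Cor30', 'FTSE100change'), ('Cor60', 'FTSE100change')]
results = {}
for x, y in combos:
    results[(x, y)] = label_clusters(x, y, Raw[[x, y]])
```

Each entry in results is a DataFrame with the original two columns plus the cluster labels, which could then be grouped and plotted.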
Hopefully this makes sense!