
I'm new to machine learning. I started with linear regression using gradient descent. I have Python code for it and I understand it this way. My question is: the gradient descent algorithm minimizes a function, so can I plot that function? I want to see what the function whose minimum is being searched for looks like. Is that possible? My code:

import matplotlib.pyplot as plt
import numpy as np

def sigmoid_activation(x):
    return 1.0 / (1 + np.exp(-x))

X = np.array([
    [2.13, 5.49],
    [8.35, 6.74],
    [8.17, 5.79],
    [0.62, 8.54],
    [2.74, 6.92] ])

y = [0, 1, 1, 0, 0]

xdata = [row[0] for row in X]
ydata = [row[1] for row in X]

X = np.c_[np.ones((X.shape[0])), X]
W = np.random.uniform(size=(X.shape[1], ))

lossHistory = []


for epoch in np.arange(0, 5):

    preds = sigmoid_activation(X.dot(W))
    error = preds - y

    loss = np.sum(error ** 2)
    lossHistory.append(loss)

    gradient = X.T.dot(error) / X.shape[0]
    W += - 0.44 * gradient


plt.scatter(xdata, ydata)
plt.show()

plt.plot(np.arange(0, 5), lossHistory)
plt.show()

for i in np.random.choice(5, 5):

    activation = sigmoid_activation(X[i].dot(W))
    label = 0 if activation < 0.5 else 1
    print("activation={:.4f}; predicted_label={}, true_label={}".format(
        activation, label, y[i]))


# decision boundary: points (x1, x2) where W[0] + W[1]*x1 + W[2]*x2 = 0
Y = (-W[0] - (W[1] * X[:, 1])) / W[2]

plt.scatter(X[:, 1], X[:, 2], c=y)
plt.plot(X[:, 1], Y, "r-")
plt.show()

1 Answer


At the risk of stating the obvious... you can simply plot lossHistory with matplotlib. Or am I missing something?
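For example, something along these lines should do (a minimal sketch; it only assumes the lossHistory list from your code has already been filled by the training loop):

import matplotlib.pyplot as plt

# plot the loss value recorded at each epoch
plt.plot(range(len(lossHistory)), lossHistory)
plt.xlabel("epoch")
plt.ylabel("sum of squared errors (loss)")
plt.show()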

EDIT: apparently the OP is asking what gradient descent (GD) is actually minimizing. I will try to answer that here, and I hope it also answers the original question.

The GD algorithm is a generic algorithm for finding the minimum of a function in parameter space. In your case (and this is how it is usually used with neural networks) you want to find the minimum of a loss function: the MSE (mean squared error). You implement the GD algorithm by updating the weights, as you did with

gradient = X.T.dot(error) / X.shape[0]
W += - 0.44 * gradient
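In symbols, these two lines are the standard gradient-descent step (a sketch that simply restates the code above, with σ the sigmoid, η the learning rate of 0.44, and N the number of samples):

$$ g = \frac{1}{N}\, X^{\top}\big(\sigma(XW) - y\big), \qquad W \leftarrow W - \eta\, g $$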

The gradient is just the partial derivative of your loss function (the MSE) with respect to the weights, so you are effectively minimizing the loss function (MSE). You then update your weights with a learning rate of 0.44, and you simply save the current value of your loss function in the array

loss = np.sum(error ** 2)
lossHistory.append(loss)

and therefore the lossHistory array contains the values of your cost (or loss) function over the epochs, which you can plot to check your learning process. The plot should show something decreasing. Does this explanation help you?
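For completeness, the quantity appended at each epoch corresponds to the sum of squared errors (just np.sum(error ** 2) written as a formula, with σ the sigmoid):

$$ L(W) = \sum_{i} \big(\sigma(x_i \cdot W) - y_i\big)^2 $$

Plotting these values against the epoch number gives exactly the learning curve described above.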

Best, Umberto

  • Yes, I plotted it. I don't know how to connect the lossHistory function with gradient descent. Does gradient descent find the minimum of the lossHistory function? – lukassz Mar 21 '18 at 13:37
  • OK, thanks for the explanation, now it is clear to me... How can I draw points on my lossHistory function like this: http://ml-cheatsheet.readthedocs.io/en/latest/_images/gradient_descent_demystified.png ? – lukassz Mar 22 '18 at 16:29
  • 1
  • Now we are discussing two different things. The plot you linked shows the cost function in parameter space, and the crosses are the weight values after each update. That is something different from the cost function versus the number of epochs. If you want to draw something similar, you first need to save all the weight values and then plot them on the cost function surface in parameter space. The problem is that your parameter space has 3 dimensions (bias + two weights), so you cannot do a plot exactly like the one you linked (you would need 4 dimensions with the z axis). A rough sketch of one workaround follows below. Hope that helps – Umberto Mar 23 '18 at 07:14
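Following up on that last comment, here is a rough sketch of the kind of plot shown in the linked image, under the simplifying assumption that the bias W[0] is held fixed at its final value so that only the two remaining weights are varied. It assumes the weights are recorded during training into a list; the name weightHistory is hypothetical, e.g. weightHistory.append(W.copy()) after every update inside the loop. X and y are the (bias-augmented) data and labels from the question.

import matplotlib.pyplot as plt
import numpy as np

def sigmoid_activation(x):
    # same activation as in the question
    return 1.0 / (1 + np.exp(-x))

def loss_at(w0, w1, w2, X, y):
    # sum of squared errors for a given weight vector, as in the question
    preds = sigmoid_activation(X.dot(np.array([w0, w1, w2])))
    return np.sum((preds - y) ** 2)

def plot_loss_surface(X, y, weightHistory):
    # hold the bias fixed at its final value (simplifying assumption)
    w0 = weightHistory[-1][0]
    # grid range chosen arbitrarily for illustration
    w1s = np.linspace(-3, 3, 100)
    w2s = np.linspace(-3, 3, 100)
    grid = np.array([[loss_at(w0, a, b, X, y) for a in w1s] for b in w2s])

    plt.contourf(w1s, w2s, grid, levels=30)
    plt.colorbar(label="loss")
    # mark the weight values visited by gradient descent
    path = np.array(weightHistory)
    plt.plot(path[:, 1], path[:, 2], "rx-")
    plt.xlabel("W[1]")
    plt.ylabel("W[2]")
    plt.show()

As Umberto notes, this only shows a 2-D slice of the 3-D parameter space, so it is an approximation of the picture in the link rather than the full cost surface.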