
I made a gradient descent algorithm in Python and it doesn't work. My m and b values keep increasing and never stop, until I get -inf or the "overflow encountered in square" error.

import numpy as np

x = np.array([2,3,4,5])
y = np.array([5,7,9,5])

m = np.random.randn()
b = np.random.randn()
error = 0
lr = 0.0001

for q in range(1000):
    for i in range(len(x)):
        ypred = m*x[i] + b
        error += (ypred - y[i]) **2
    m = m - (x * error) *lr
    b = b - (lr * error)
print(b,m)

I expected my algorithm to return the best m and b values for my data (x and y) but it didn't work. What is going wrong?

Dorito

2 Answers

import numpy as np

x = np.array([2,3,4,5])
y = 0.3*x + 0.6                # synthetic data, so the true m = 0.3 and b = 0.6 are known

m = np.random.randn()
b = np.random.randn()

lr = 0.001

for q in range(100000):
    ypred = m*x + b                                          # predictions for all samples at once
    error = (1./(2*len(x))) * np.sum(np.square(ypred - y))   # eq 1: cost
    m = m - lr * np.sum((ypred - y)*x)/len(x)                # eq 2 and eq 4: gradient step for m
    b = b - lr * np.sum(ypred - y)/len(x)                    # eq 3 and eq 5: gradient step for b

print(m, b)

Output:

0.30007724168011807 0.5997039817571881

Math behind it

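With n = len(x) and learning rate α = lr, the updates in the code correspond to (the eq numbers match the comments above):

J(m, b) = \frac{1}{2n} \sum_{i=1}^{n} \left(m x_i + b - y_i\right)^2                          % eq 1: cost
\frac{\partial J}{\partial m} = \frac{1}{n} \sum_{i=1}^{n} \left(m x_i + b - y_i\right) x_i   % eq 2
\frac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} \left(m x_i + b - y_i\right)       % eq 3
m \leftarrow m - \alpha \frac{\partial J}{\partial m}                                         % eq 4
b \leftarrow b - \alpha \frac{\partial J}{\partial b}                                         % eq 5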

Use numpy vectorized operations to avoid loops.
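For example, with arbitrary values for m and b, the per-element loop from the question and a single vectorized expression give the same squared error:

import numpy as np

x = np.array([2, 3, 4, 5])
y = np.array([5, 7, 9, 5])
m, b = 0.5, 0.1   # arbitrary example parameters

# loop version, as in the question: one sample at a time
error_loop = 0.0
for i in range(len(x)):
    error_loop += (m * x[i] + b - y[i]) ** 2

# vectorized version: whole arrays at once, no Python loop
error_vec = np.sum((m * x + b - y) ** 2)

print(error_loop, error_vec)   # identical values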

mujjiga

I think you implemented the update formula incorrectly:

  • sum x * (ypred - y) instead of multiplying x by the accumulated squared error
  • divide by the length of x

See the code below:

import numpy as np

x = np.array([2,3,4,5])
y = np.array([5,7,9,11])

m = np.random.randn()
b = np.random.randn()
error = 0
lr = 0.1
print(b, m)

for q in range(1000):
    ypred = []
    for i in range(len(x)):
        temp = m*x[i] + b
        ypred.append(temp)
        error += temp - y[i]                        # accumulated but not used in the updates
    m = m - np.sum(x * (ypred - y)) * lr / len(x)   # mean gradient step for m
    b = b - np.sum(lr * (ypred - y)) / len(x)       # mean gradient step for b
print(b, m)

Output:

-1.198074371762264 0.058595039571115955   # initial weights
0.9997389097653074 2.0000681277214487     # Final weights
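
As a quick sanity check, this data lies exactly on y = 2x + 1, which is what the final weights recover:

import numpy as np

x = np.array([2, 3, 4, 5])
print(2 * x + 1)   # [ 5  7  9 11] -- equals y, matching m ≈ 2 and b ≈ 1 above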
Sociopath
    Your parameter updates are wrong. Also, those are not the real values, they are just random initializations; GD is supposed to fine-tune the parameters to best fit the data. – mujjiga May 09 '19 at 05:55