
I made a gradient descent algorithm in Python and it doesn't work. My m and b values keep increasing and never stop, until I get -inf or the "overflow encountered in square" error.

import numpy as np

x = np.array([2,3,4,5])
y = np.array([5,7,9,5])

m = np.random.randn()
b = np.random.randn()
error = 0
lr = 0.0001

for q in range(1000):
    for i in range(len(x)):
        ypred = m*x[i] + b
        error += (ypred - y[i]) **2
    m = m - (x * error) *lr
    b = b - (lr * error)
print(b,m)

I expected my algorithm to return the best m and b values for my data (x and y) but it didn't work. What is going wrong?

Dorito

2 Answers

import numpy as np

x = np.array([2,3,4,5])
y = 0.3*x + 0.6                # synthetic data, so the true m = 0.3 and b = 0.6 are known

m = np.random.randn()
b = np.random.randn()

lr = 0.001

for q in range(100000):
    ypred = m*x + b                                          # predictions for all samples at once
    error = (1./(2*len(x))) * np.sum(np.square(ypred - y))   # eq 1: cost
    m = m - lr * np.sum((ypred - y)*x)/len(x)                # eq 2 and eq 4: gradient step for m
    b = b - lr * np.sum(ypred - y)/len(x)                    # eq 3 and eq 5: gradient step for b

print(m, b)

Output:

0.30007724168011807 0.5997039817571881

Math behind it

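With n = len(x) and learning rate α = lr, the updates in the code correspond to (the eq numbers match the comments above):

J(m, b) = \frac{1}{2n} \sum_{i=1}^{n} \left(m x_i + b - y_i\right)^2                          % eq 1: cost
\frac{\partial J}{\partial m} = \frac{1}{n} \sum_{i=1}^{n} \left(m x_i + b - y_i\right) x_i   % eq 2
\frac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} \left(m x_i + b - y_i\right)       % eq 3
m \leftarrow m - \alpha \frac{\partial J}{\partial m}                                         % eq 4
b \leftarrow b - \alpha \frac{\partial J}{\partial b}                                         % eq 5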

Use numpy vectorized operations to avoid loops.
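For example, with arbitrary values for m and b, the per-element loop from the question and a single vectorized expression give the same squared error:

import numpy as np

x = np.array([2, 3, 4, 5])
y = np.array([5, 7, 9, 5])
m, b = 0.5, 0.1   # arbitrary example parameters

# loop version, as in the question: one sample at a time
error_loop = 0.0
for i in range(len(x)):
    error_loop += (m * x[i] + b - y[i]) ** 2

# vectorized version: whole arrays at once, no Python loop
error_vec = np.sum((m * x + b - y) ** 2)

print(error_loop, error_vec)   # identical values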

mujjiga

I think you implemented the update formula incorrectly:

  • sum x * (ypred - y) instead of multiplying x by the accumulated squared error
  • divide by the length of x

See the code below:

import numpy as np

x = np.array([2,3,4,5])
y = np.array([5,7,9,11])

m = np.random.randn()
b = np.random.randn()
error = 0
lr = 0.1
print(b, m)

for q in range(1000):
    ypred = []
    for i in range(len(x)):
        temp = m*x[i] + b
        ypred.append(temp)
        error += temp - y[i]                        # accumulated but not used in the updates
    m = m - np.sum(x * (ypred - y)) * lr / len(x)   # mean gradient step for m
    b = b - np.sum(lr * (ypred - y)) / len(x)       # mean gradient step for b
print(b, m)

Output:

-1.198074371762264 0.058595039571115955   # initial weights
0.9997389097653074 2.0000681277214487     # Final weights
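
As a quick sanity check, this data lies exactly on y = 2x + 1, which is what the final weights recover:

import numpy as np

x = np.array([2, 3, 4, 5])
print(2 * x + 1)   # [ 5  7  9 11] -- equals y, matching m ≈ 2 and b ≈ 1 above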
Sociopath
    Your parameter updates are wrong. Also, those are not the real values, they are just random initializations; GD is supposed to fine-tune the parameters to best fit the data. – mujjiga May 09 '19 at 05:55