
I have implemented linear regression on the Boston dataset and obtained the following results:

The loss value does not change as the number of epochs increases. What is the reason for this mistake? Please help me.

import pandas as pd
import torch
import numpy as np
import torch.nn as nn
from sklearn import preprocessing

# load the data and normalize features and target
training_set = pd.read_csv('boston_data.csv')
training_set = training_set.to_numpy()
test_set = test_set.to_numpy()   # test_set is loaded the same way (loading line omitted here)
inputs = training_set[:, 0:13]
inputs = preprocessing.normalize(inputs)
target = training_set[:, 13:14]
target = preprocessing.normalize(target)
inputs = torch.from_numpy(inputs)
target = torch.from_numpy(target)
test_set = torch.from_numpy(test_set)

# parameters: one weight per feature, one bias per training sample
w = torch.randn(13, 1, requires_grad=True)
b = torch.randn(404, 1, requires_grad=True)

def model(x):
    return x @ w + b

def loss_MSE(x, y):
    ras = x - y
    return torch.sum(ras * ras) / ras.numel()

# initial prediction (recomputed inside the loop)
pred = model(inputs.float())

# training loop: plain gradient descent with learning rate 1e-5
for i in range(100):
    pred = model(inputs.float())
    loss = loss_MSE(target, pred)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()
    print(loss)
David Duran

1 Answer


Welcome to Stack Overflow.

Your main loop is fine (you could have made your life much easier, though; you should probably read this), but your learning rate (1e-5) is most likely way too low.
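As a rough sketch of what I mean by "easier": the same model can be written with nn.Linear, nn.MSELoss and torch.optim.SGD, so you do not have to manage w, b and the gradients yourself. The dummy data and the 1e-2 learning rate below are illustrative choices of mine, not values from your post:

import torch
import torch.nn as nn

# dummy data: 404 samples, 13 features, targets from a random linear rule plus noise
x = torch.randn(404, 13)
y = x @ torch.randn(13, 1) + 0.1 * torch.randn(404, 1)

model = nn.Linear(13, 1)                                  # weight (13 -> 1) and a single bias
criterion = nn.MSELoss()                                  # same formula as your loss_MSE
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)  # learning rate is the key knob

for epoch in range(100):
    pred = model(x)
    loss = criterion(pred, y)
    optimizer.zero_grad()   # replaces the manual grad.zero_() calls
    loss.backward()
    optimizer.step()        # replaces the manual w -= ..., b -= ... updates
    print(loss.item())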

I tried with a small dummy problem: it was solved very quickly with a learning rate of about 1e-2, and would take tremendously longer with 1e-5. It does converge either way, but only after far more than 100 epochs. You mentioned that you tried increasing the number of epochs, but did not say how many you actually ran. Please try increasing the learning rate to see whether that solves your issue. You can also try removing the division by numel(), which has the same effect (the division is applied to the gradients as well).
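For reference, the dummy experiment I mean looks roughly like this. It keeps your manual update loop and only swaps the CSV loading for random data and the learning rate for 1e-2 (it also uses a single shared bias instead of your per-sample b; the exact values are illustrative):

import torch

# two lines of dummy data instead of loading and normalizing the CSV
inputs = torch.randn(404, 13)
target = inputs @ torch.randn(13, 1) + 0.1 * torch.randn(404, 1)

w = torch.randn(13, 1, requires_grad=True)
b = torch.randn(1, requires_grad=True)       # one shared bias is enough here

def model(x):
    return x @ w + b

def loss_MSE(x, y):
    ras = x - y
    return torch.sum(ras * ras) / ras.numel()

for i in range(100):
    pred = model(inputs)
    loss = loss_MSE(target, pred)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-2                   # with 1e-2 the printed loss drops visibly
        b -= b.grad * 1e-2                   # with 1e-5 it barely moves in 100 epochs
        w.grad.zero_()
        b.grad.zero_()
    print(loss.item())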

Next time, please provide a small, minimal example that can be run and helps reproduce your error. Here, most of your code is data loading, which could be replaced with two lines of dummy data generation (as in the sketch above).

trialNerror