I am working with a real estate dataset of about 21,000 rows, of which 15,129 make up the training set. There are 15 features. The task is to implement linear regression manually using SGD and compare the feature weights with the weights that sklearn's LinearRegression model gives. (All data is normalized using sklearn's StandardScaler.)
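For context, this is roughly how the data is prepared (a minimal sketch; df and the "price" column are placeholder names for my actual dataset):

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # df is the real estate DataFrame; "price" is the target (placeholder names)
    X = df.drop(columns=["price"]).values
    y = df["price"].values

    # 15129 rows go into the training set
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=15129, random_state=0)

    # normalize the features with StandardScaler (fit on train only)
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)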
import numpy as np
import pandas as pd

def gradient3(X, y):
    X = pd.DataFrame(X)
    y = pd.DataFrame(y)
    w1 = np.random.randn(X.shape[1])
    w2 = np.random.randn(X.shape[1])
    b = 0
    eps = 0.001
    alpha = 1
    counter = 1
    lmbda = 0.1
    while np.linalg.norm(w1 - w2) > eps:
        # choose a random sample index
        rand_index = np.random.randint(X.shape[0])
        X_tr = X.loc[rand_index].values
        y_tr = y.loc[rand_index].values
        # calculate the new w
        err = X_tr.dot(w1) + b - y_tr
        loss_w = 2 * err * X_tr + (lmbda * w1)
        loss_b = 2 * err
        w2 = w1.copy()
        w1 = w1 - alpha * loss_w
        b = b - alpha * loss_b
        # reduce alpha
        counter += 1
        alpha = 1 / counter
    return w1, b
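For clarity, the per-sample loss I am trying to minimize here is L(w, b) = (x·w + b - y)^2 + (lmbda/2)·||w||^2, so the gradients are dL/dw = 2·(x·w + b - y)·x + lmbda·w and dL/db = 2·(x·w + b - y), which is what loss_w and loss_b compute.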
I tried to implement SGD and expected to get a list of feature weights, w, and a bias value, b. The problem is that the program sometimes goes into an infinite loop, and sometimes it produces completely chaotic weights, depending on my learning rate parameter (alpha) and how fast it decreases. I don't quite understand what exactly the problem is. Maybe SGD just doesn't work with this dataset and I need mini-batches, maybe I missed something in the algorithm, or maybe I'm implementing regularization incorrectly. I would be very grateful if someone could point out what is wrong with my implementation.
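For reference, this is roughly how I compare the results against sklearn (a minimal sketch; X_train and y_train are placeholder names for the scaled training data):

    from sklearn.linear_model import LinearRegression

    # fit sklearn's model on the same scaled training data
    reg = LinearRegression()
    reg.fit(X_train, y_train)

    # my manual SGD result
    w, b = gradient3(X_train, y_train)

    print("sklearn weights:", reg.coef_, "bias:", reg.intercept_)
    print("my SGD weights: ", w, "bias:", b)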