I applied lasso regression and ridge regression to my forest fire sample dataset, but my accuracy is much lower than what I should achieve.

I have already tried changing the alpha values and the train/test split.

# Import the libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
# Load the data file
forest = pd.read_csv('forestfires.csv')
# Encode the month and day names as integers
forest['month'] = forest['month'].replace(['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'], [1,2,3,4,5,6,7,8,9,10,11,12])
forest['day'] = forest['day'].replace(['mon','tue','wed','thu','fri','sat','sun'], [1,2,3,4,5,6,7])
# iloc selects by integer position, loc selects by label; the features and target are sliced here
X = forest.iloc[:,0:12].values
y = forest.iloc[:,12].values
# Split the data into 70% train and 30% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=3)
# Fit a linear regression on the training data
lr = LinearRegression()
lr.fit(X_train, y_train)
# Fit the ridge regression models
rr = Ridge(alpha=0.01)
rr.fit(X_train, y_train)

rr100 = Ridge(alpha=100)
rr100.fit(X_train, y_train)
# Lasso regression step (only the linear and ridge R^2 scores are computed below)
train_score = lr.score(X_train, y_train)
test_score = lr.score(X_test, y_test)

Ridge_train_score = rr.score(X_train, y_train)
Ridge_test_score = rr.score(X_test, y_test)

Ridge_train_score100 = rr100.score(X_train, y_train)
Ridge_test_score100 = rr100.score(X_test, y_test)

print("linear regression train score:", train_score)
print("linear regression test score:", test_score)
print('ridge regression train score, low alpha: %.2f' % Ridge_train_score)
print('ridge regression test score, low alpha: %.2f' % Ridge_test_score)
print('ridge regression train score, high alpha: %.2f' % Ridge_train_score100)
print('ridge regression test score, high alpha: %.2f' % Ridge_test_score100)
  • Could you please provide a working, executable minimal example, as described in [mcve](https://stackoverflow.com/help/mcve)? With the information you provide, it is pretty much impossible to solve your problem. Furthermore, this is about the internal algorithms of the regression solver, so it might be better suited to stats.stackexchange. – JE_Muc Apr 29 '19 at 13:28

1 Answer

Regarding your question: I don't see any Lasso regression in your code. Trying LassoCV or ElasticNetCV(l1_ratio=[.1, .5, .7, .9, .95, .99, 1]) is always a good start for finding reasonable alpha values. For Ridge, the cross-validated estimator is RidgeCV. In contrast to LassoCV and ElasticNetCV, which build their own grid of alphas, RidgeCV uses leave-one-out CV by default and only evaluates a fixed set of alpha values (0.1, 1.0 and 10.0 unless you pass others), so it needs more manual tuning to get a good result. Take the following code as an example:

import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LassoCV, ElasticNetCV
from sklearn.linear_model import Ridge, RidgeCV

forest = pd.read_csv('forestfires.csv')
# Encode the month and day names as integers
forest['month'] = forest['month'].replace(['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'], [1,2,3,4,5,6,7,8,9,10,11,12])
forest['day'] = forest['day'].replace(['mon','tue','wed','thu','fri','sat','sun'], [1,2,3,4,5,6,7])
# iloc selects by integer position, loc selects by label; the features and target are sliced here
X = forest.iloc[:,0:12].values
y = forest.iloc[:,12].values
# Split the data into 70% train and 30% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=3)
# Set up the linear regression (fitted further below)
lr = LinearRegression()

# The cross validation algorithms:
lasso_cv = LassoCV()  # LassoCV will try to find the best alpha for you
# ElasticNetCV will try to find the best alpha for you, for a given set of mixes of the Ridge and Lasso penalties (l1_ratio)
enet_cv = ElasticNetCV()
ridge_cv = RidgeCV()

lr.fit(X_train, y_train)

lasso_cv.fit(X_train, y_train)
enet_cv.fit(X_train, y_train)
ridge_cv.fit(X_train, y_train)

# Fit the ridge regression models with fixed alphas, as in the question
rr = Ridge(alpha=0.01)
rr.fit(X_train, y_train)
rr100 = Ridge(alpha=100)
rr100.fit(X_train, y_train)

Now check the alpha values that were found:

print('LassoCV alpha:', lasso_cv.alpha_)
print('RidgeCV alpha:', ridge_cv.alpha_)
print('ElasticNetCV alpha:', enet_cv.alpha_, 'ElasticNetCV l1_ratio:', enet_cv.l1_ratio_)
ridge_alpha = ridge_cv.alpha_
enet_alpha, enet_l1ratio = enet_cv.alpha_, enet_cv.l1_ratio_

Then center your new RidgeCV and/or ElasticNetCV around these values (l1_ratio has to stay between 0 and 1, so the candidates are capped at 1):

# sorted(set) drops the duplicates that capping at 1.0 can create
enet_new_l1ratios = sorted({min(enet_l1ratio * mult, 1.0) for mult in [.9, .95, 1, 1.05, 1.1]})
ridge_new_alphas = [ridge_alpha * mult for mult in [.9, .95, 1, 1.05, 1.1]]

# fit Enet and Ridge again:
enet_cv = ElasticNetCV(l1_ratio=enet_new_l1ratios)
ridge_cv = RidgeCV(alphas=ridge_new_alphas)

enet_cv.fit(X_train, y_train)
ridge_cv.fit(X_train, y_train)
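
Since RidgeCV only evaluates the alphas it is given, a wider first pass over a logarithmic grid can help before narrowing down. The following is only a rough sketch: it uses synthetic data from make_regression as a stand-in for forestfires.csv, and the grid bounds (1e-3 to 1e3, then half a decade either side of the best value) are my own assumptions, not recommended values.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# Synthetic stand-in for the forest fire data (assumption, not the real dataset)
X, y = make_regression(n_samples=200, n_features=12, noise=10.0, random_state=0)

# Coarse pass: alphas spread over several orders of magnitude
coarse_alphas = np.logspace(-3, 3, 13)
ridge_coarse = RidgeCV(alphas=coarse_alphas).fit(X, y)

# Fine pass: a denser grid spanning one decade around the coarse optimum
best = ridge_coarse.alpha_
fine_alphas = np.logspace(np.log10(best) - 0.5, np.log10(best) + 0.5, 21)
ridge_fine = RidgeCV(alphas=fine_alphas).fit(X, y)

print('coarse alpha:', ridge_coarse.alpha_)
print('fine alpha:', ridge_fine.alpha_)
```

If the best alpha sits at the edge of the coarse grid, widen the grid rather than refining around it.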

This should be the first step towards finding a good alpha value and/or l1 ratio for your models. Of course, other steps such as feature engineering and selecting the right model (e.g. Lasso, which performs feature selection) should precede the parameter search. :)
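
To illustrate what "Lasso performs feature selection" means in practice, here is a small sketch. It assumes synthetic data from make_regression (12 features, of which only 4 are informative) instead of the real dataset, and it scales the features first, which is my own addition so that the L1 penalty treats all features alike.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 12 features, only 4 of them influence the target
X, y = make_regression(n_samples=300, n_features=12, n_informative=4,
                       noise=5.0, random_state=0)

# Scale the features, then let LassoCV pick alpha with 5-fold CV
model = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
model.fit(X, y)

# make_pipeline names each step after its lowercased class name
lasso = model.named_steps['lassocv']
kept = np.flatnonzero(lasso.coef_)  # indices of features with non-zero weight
dropped = np.setdiff1d(np.arange(X.shape[1]), kept)

print('chosen alpha:', lasso.alpha_)
print('features kept:', kept)
print('features dropped:', dropped)
```

Features whose coefficient ends up at exactly zero are effectively removed from the model. How many are dropped depends on the alpha that cross-validation selects.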

JE_Muc
  • Scotty, thank you for the good explanation, I really appreciate it. I am still getting used to these models and algorithms. However, I get a warning message: `FutureWarning: You should specify a value for 'cv' instead of relying on the default value. The default value will change from 3 to 5 in version 0.22. warnings.warn(CV_WARNING, FutureWarning)`. Can you help me fix this? My sklearn version is 0.20.1. – Cankat Saraç Apr 29 '19 at 18:27
  • You are welcome. And yes, as the warning message states, you should specify a value for cv, for example `enet_cv = ElasticNetCV(cv=5)` and `lasso_cv = LassoCV(cv=5)`. – JE_Muc Apr 29 '19 at 18:34
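
The fix from the last comment can be sketched as follows. The data is again a synthetic stand-in from make_regression (an assumption, not the forest fire data). Checking the shape of `mse_path_` is just one way to confirm that five folds were actually used.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV, LassoCV

# Synthetic stand-in for the forest fire data (assumption)
X, y = make_regression(n_samples=200, n_features=12, noise=10.0, random_state=0)

# Passing cv explicitly avoids relying on the version-dependent default
lasso_cv = LassoCV(cv=5).fit(X, y)
enet_cv = ElasticNetCV(cv=5, l1_ratio=[.1, .5, .7, .9, .95, .99, 1]).fit(X, y)

# The last axis of mse_path_ has one entry per CV fold
print(lasso_cv.mse_path_.shape)  # (n_alphas, n_folds)
print(enet_cv.mse_path_.shape)   # (n_l1_ratios, n_alphas, n_folds)
```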