I am using historical data to predict car accident costs. I don't want to completely ignore older data, but I would like to weight more recent data more heavily when training the model. I plan on trying several sklearn regressors, such as linear regression and random forest regression. Is there a way to incorporate this into an sklearn model?
-
Please show what you have tried. sklearn's `LinearRegression` class supports weights for samples. Have you tried it? – Gilad Green Jun 12 '20 at 16:28
-
Thank you! I haven't built the model yet; I was first trying to figure out how the weighting would work. If I have a column holding the number of years since the accident, how do I convert it into a weight that gets smaller as the number of years grows? – tshwizz Jun 12 '20 at 17:11
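One common way to turn "years since the accident" into a weight is exponential decay. The sketch below is only an illustration: the function name `recency_weights` and the `half_life` value of 3 years are assumptions made here, not something from the thread, and the right decay rate would have to be tuned for the data at hand.

```python
def recency_weights(years_ago, half_life=3.0):
    """Exponential decay: a sample's weight halves every `half_life` years.

    `half_life` is a hypothetical tuning knob; smaller values discount
    old data more aggressively.
    """
    return [0.5 ** (y / half_life) for y in years_ago]


# Hypothetical column: number of years passed since each accident.
years_since_accident = [0, 1, 3, 6, 10]
weights = recency_weights(years_since_accident)

for years, w in zip(years_since_accident, weights):
    print(f"{years:>2} years ago -> weight {w:.3f}")
```

The newest rows get weight 1 and older rows shrink toward 0 without ever being dropped entirely. Only the relative size of the weights matters for most estimators, so other monotonically decreasing choices (for example `1 / (1 + years)`) are equally valid starting points.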
-
Use the `sample_weight` parameter of the `.fit()` method. See the [docs](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.fit) or [here](https://stackoverflow.com/questions/35236836/weighted-linear-regression-with-scikit-learn) for an explanation. – Sergey Bushmanov Jun 12 '20 at 22:50
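A minimal sketch of passing `sample_weight` to `.fit()` for both regressors mentioned in the question. The data here is synthetic and the decay formula (weight halving every 3 years) is an assumption chosen for illustration; with real data the weights would be computed from the actual years-since-accident column.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the accident data (illustrative only).
rng = np.random.default_rng(0)
n_samples = 200
years_ago = rng.integers(0, 10, size=n_samples)  # age of each record
X = rng.normal(size=(n_samples, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n_samples)

# Assumed weighting: a record's weight halves every 3 years.
sample_weight = 0.5 ** (years_ago / 3.0)

# Both estimators accept per-sample weights in fit().
lin = LinearRegression().fit(X, y, sample_weight=sample_weight)
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(
    X, y, sample_weight=sample_weight
)

print(lin.coef_)
print(forest.predict(X[:5]))
```

Not every sklearn estimator accepts `sample_weight`, so it is worth checking the `fit` signature of each model before relying on it. If the models are compared on a validation set, it may also make sense to pass the same kind of weights to the scoring step so that recent records count more there too.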
-
@tshwizz Did you figure out a solution? How do you set the sample weight based on date? – Cy T Mar 02 '22 at 02:25