
I am trying to predict the sales quantity of a specific item over a whole year, using the LightGBM package. The params I have set are as follows:

    params = {
        'nthread': 10,
        'max_depth': 5,
        'task': 'train',
        'boosting_type': 'gbdt',
        'objective': 'regression_l1',
        'metric': 'mape',
        'num_leaves': 2,
        'learning_rate': 0.2180,
        'feature_fraction': 0.9,
        'bagging_fraction': 0.990,
        'bagging_freq': 1,
        'lambda_l1': 3.097758978478437,
        'lambda_l2': 2.9482537987198496,
        'verbose': 1,
        'min_child_weight': 0.001,
        'min_split_gain': 0.037310344962162616,
        'min_data_in_bin': 1,
        'min_data_in_leaf': 2,
        'num_boost_round': 1,
        'max_bin': 7,
        'extra_trees': True,
        'early_stopping_rounds': -1
    }
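To make the capacity of these settings concrete: with 'num_leaves': 2 and 'num_boost_round': 1 the model is a single tree with two leaves. The sketch below is a hand-rolled, pure-Python stand-in (it is not LightGBM's algorithm, and `fit_stump`, `predict_stump` and the toy numbers are hypothetical) that shows such a tree can emit at most two distinct predictions, however the target varies.

```python
from statistics import median


def fit_stump(x, y):
    """Fit a one-split tree (two leaves) that minimises absolute error.

    Illustrative only: a hand-rolled stump, not LightGBM's implementation.
    Each leaf predicts the median of its targets, the L1-optimal constant.
    """
    best = None
    # Try every threshold that leaves at least one point on each side.
    for threshold in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        left_value, right_value = median(left), median(right)
        loss = sum(abs(yi - left_value) for yi in left)
        loss += sum(abs(yi - right_value) for yi in right)
        if best is None or loss < best[0]:
            best = (loss, threshold, left_value, right_value)
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value


def predict_stump(stump, x):
    """Return the leaf value for each input, based on the single split."""
    threshold, left_value, right_value = stump
    return [left_value if xi <= threshold else right_value for xi in x]


# Hypothetical toy data: a day index as the only feature, made-up quantities.
days = list(range(1, 13))
quantity = [10, 12, 11, 30, 32, 31, 50, 52, 51, 20, 22, 21]

stump = fit_stump(days, quantity)
predictions = predict_stump(stump, days)
distinct_predictions = sorted(set(predictions))
print(distinct_predictions)
```

If this picture carries over to the real model, allowing more boosting rounds and more leaves is the first thing to revisit, since two output levels cannot follow a year of daily sales.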

My dataset consists of daily sales data (columns: date, quantity) for the years 2017, 2018 and 2019, plus 3 months of 2020. I use the 2017 and 2018 data for training and cross-validation and test on the 2019 data. However, my predictions for the year are way off the mark when the quantities are compared on a weekly, monthly, quarterly or yearly basis (error of roughly 40-50%, and that is after tuning the params to bring the error down to this level). Moreover, r2_score on the predictions gives a negative value of around -2.9148426301633803. Any suggestions on what can be done to improve this?
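For reference, a minimal sketch of how the two figures above can be computed, assuming R² is defined as 1 - SS_res / SS_tot (so a negative value means the predictions are worse than always predicting the mean of the actuals) and that the percentage error is taken on period totals. The numbers are made up and are not my data; `r2`, `aggregate` and `mape` are hypothetical helper names.

```python
def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def aggregate(values, period):
    """Sum consecutive chunks of `period` daily values (e.g. 7 for weekly)."""
    return [sum(values[i:i + period]) for i in range(0, len(values), period)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical two weeks of daily quantities (made-up numbers).
actual = [10, 12, 14, 16, 18, 20, 22, 11, 13, 15, 17, 19, 21, 23]

# Baseline: always predict the mean of the actuals.
mean_baseline = [sum(actual) / len(actual)] * len(actual)

# A forecast that overshoots every day by 50%.
biased = [a * 1.5 for a in actual]

r2_baseline = r2(actual, mean_baseline)
r2_biased = r2(actual, biased)
weekly_mape = mape(aggregate(actual, 7), aggregate(biased, 7))

print(r2_baseline, r2_biased, weekly_mape)
```

In this toy case the mean baseline scores an R² of zero while the systematically biased forecast scores below zero, which is the same pattern as in my results.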

jottbe
