My project involves predicting the sales quantity of a specific item across a whole year. I'm using the LightGBM package to make the predictions, with the following params:
params = {
    'nthread': 10,
    'max_depth': 5,
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'regression_l1',
    'metric': 'mape',
    'num_leaves': 2,
    'learning_rate': 0.2180,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.990,
    'bagging_freq': 1,
    'lambda_l1': 3.097758978478437,
    'lambda_l2': 2.9482537987198496,
    'verbose': 1,
    'min_child_weight': 0.001,
    'min_split_gain': 0.037310344962162616,
    'min_data_in_bin': 1,
    'min_data_in_leaf': 2,
    'num_boost_round': 1,
    'max_bin': 7,
    'extra_trees': True,
    'early_stopping_rounds': -1,
}
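For reference, the 'mape' metric above is mean absolute percentage error, which is also how I'm measuring the 40-50% error mentioned below. A minimal stdlib sketch of that metric, using toy numbers (not my data):

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    errors = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)

# Toy numbers only: per-point errors are 50%, 10%, 25%, so the mean is ~28.33%.
print(mape([100, 200, 400], [150, 180, 300]))  # -> 28.33...
```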
My dataset consists of daily sales data (columns: date, quantity) for 2017, 2018, 2019, and the first 3 months of 2020. I've been using the 2017 and 2018 data for training and cross-validation, and testing on the 2019 data. However, my predictions are way off the mark whether I aggregate the quantities weekly, monthly, quarterly, or yearly (error ~40-50%, and that's after tuning the params to bring it down this far). Moreover, the r2_score on my predictions is negative, around -2.91. Any suggestions on what I can do to make it better?
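To clarify what I mean by the negative score: r2_score follows the usual coefficient-of-determination definition, and a negative value means the model does worse than always predicting the mean of the test targets. A minimal stdlib sketch with toy numbers (not my data):

```python
def r2(y_true, y_pred):
    """Coefficient of determination, as sklearn's r2_score computes it."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy numbers only: predictions far from the targets, so ss_res > ss_tot
# and the score goes negative.
y_true = [10, 12, 14, 16]
y_pred = [16, 8, 20, 6]
print(r2(y_true, y_pred))  # -> -8.4 (worse than predicting the mean)
```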