I have a data set from a tea export company that includes total exports broken down by tea type and weight category.
It looks like this:
Date        Type   Weight   Quantity     Price
2016-01-01  black  bags     1734136.51   1131.30
2016-01-01  black  bulk     10722389.66  510.86
2016-01-01  black  4g_1kg   6817078.01   588.72
2016-01-01  black  1kg_3kg  86444.50     565.91
2016-01-01  black  3kg_5kg  1003986.73   552.39
I have grouped the data by month, tea type, and weight category with this:
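import pandas as pd

df = pd.DataFrame(data)
# Parse the date strings and keep only the calendar date
df['Date'] = pd.to_datetime(df['Date']).dt.date
# Encode year and month as one integer, e.g. 201601 for January 2016
df['YearMonth'] = df['Date'].map(lambda date: 100*date.year + date.month)
# Sum the quantity per month, tea type, and weight category
df = df.groupby(['YearMonth', 'Type', 'Weight']).agg({'Quantity': 'sum'})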
The dataframe now looks like this:
YearMonth  Type     Weight    Quantity
201601     black    1kg_3kg   86444.50
                    3kg_5kg   1003986.73
                    4g_1kg    6817078.01
                    5kg_10kg  2816810.33
                    bags      1734136.51
                    bulk      10722389.66
           green    3kg_5kg   12.00
                    4g_1kg    53014.95
                    5kg_10kg  1132.00
                    bags      41658.19
                    bulk      112400.00
           instant  4g_1kg    28.80
                    lt3kg     89486.40
201602     black    1kg_3kg   215539.60
I tried simple approaches with XGBoost and linear regression to predict future quantities, but they didn't work. What I want is a forecast of the overall total for the next few years, plus separate forecasts for each tea type and weight category. Can someone tell me how to achieve this?
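
To make the target concrete, here is a minimal sketch of the shape I think the data needs before any forecasting: one monthly series per (Type, Weight) combination, plus an overall monthly total. It uses only a few rows copied from the grouped output above, the names `grouped`, `wide`, and `total` are my own, and filling missing combinations with 0 is an assumption that may not suit every series. It only reshapes the data and does no forecasting itself.

```python
import pandas as pd

# A few rows copied from the grouped output above (not the full data set).
rows = [
    (201601, "black", "1kg_3kg", 86444.50),
    (201601, "black", "3kg_5kg", 1003986.73),
    (201601, "green", "bulk", 112400.00),
    (201602, "black", "1kg_3kg", 215539.60),
]
grouped = pd.DataFrame(rows, columns=["YearMonth", "Type", "Weight", "Quantity"])

# Turn the integer YearMonth (e.g. 201601) into a monthly period.
grouped["Month"] = pd.to_datetime(
    grouped["YearMonth"].astype(str), format="%Y%m"
).dt.to_period("M")

# One row per month, one column per (Type, Weight) series.
# Assumption: a combination that is absent in a month is treated as 0.
wide = grouped.pivot_table(
    index="Month",
    columns=["Type", "Weight"],
    values="Quantity",
    aggfunc="sum",
    fill_value=0,
)

# Overall monthly total across all series.
total = wide.sum(axis=1)

print(wide)
print(total)
```

Each column of `wide` would then be one individual series to forecast, and `total` would be the overall series.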