I am very new to time series forecasting in Python with Prophet. I have a dataset with three variables: article code, date, and quantity sold. I am trying to forecast the quantity sold for each article for each month using Prophet.

I tried using a for loop to run the forecast for each article, but I am not sure how to include the article code in the output (forecast) data, or how to write the result to a file directly from the for loop.

df2 = df2.rename(columns={'Date of the document': 'ds','Quantity sold': 'y'})
for article in df2['Article bar code']:

    # set the uncertainty interval to 95% (the Prophet default is 80%)
    my_model = Prophet(weekly_seasonality= True, daily_seasonality=True,seasonality_prior_scale=1.0)
    my_model.fit(df2)
    future_dates = my_model.make_future_dataframe(periods=6, freq='MS')
    forecast = my_model.predict(future_dates)
return forecast

I want the output to look like the example below, and I want it written to an output file directly from the for loop.

Output Expected

Thanks in Advance.

vishnu prashanth

2 Answers

Split your dataframe by article, fit one model per article, and store each forecast in a dictionary keyed by the article code:

from prophet import Prophet  # older releases: from fbprophet import Prophet

def get_prediction(df):
    prediction = {}
    # Rename to the column names Prophet expects ('ds' for dates, 'y' for values)
    df2 = df.rename(columns={'Date of the document': 'ds', 'Quantity sold': 'y', 'Article bar code': 'article'})
    list_articles = df2['article'].unique()

    for article in list_articles:
        # Fit a separate model on the rows of this article only
        article_df = df2.loc[df2['article'] == article]
        # Prophet's default uncertainty interval is 80%; pass interval_width=0.95 for 95%
        my_model = Prophet(weekly_seasonality=True, daily_seasonality=True, seasonality_prior_scale=1.0)
        my_model.fit(article_df)
        future_dates = my_model.make_future_dataframe(periods=6, freq='MS')
        forecast = my_model.predict(future_dates)
        prediction[article] = forecast
    return prediction

Now `prediction` holds one forecast dataframe per article, keyed by the article code.
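To get from that dictionary to a single file with the article code on every row, one option is to stack the per-article forecasts with `pd.concat`. The snippet below is only a sketch: `predictions_to_frame`, the stand-in forecasts with made-up numbers, and the file name `forecast_by_article.csv` are all hypothetical, and in practice you would pass in the dictionary returned by `get_prediction`.

```python
import pandas as pd

def predictions_to_frame(prediction):
    """Stack a {article: forecast_df} dict into one long dataframe."""
    # pd.concat on a dict uses the dict keys as the outer index level
    combined = pd.concat(prediction, names=["article", "row"])
    # Move the article key out of the index into an ordinary column
    return combined.reset_index(level="article").reset_index(drop=True)

# Stand-in forecasts (made-up numbers) in place of real Prophet output
example = {
    "A100": pd.DataFrame({
        "ds": pd.to_datetime(["2018-04-01", "2018-05-01"]),
        "yhat": [12.0, 13.5],
    }),
    "B200": pd.DataFrame({
        "ds": pd.to_datetime(["2018-04-01", "2018-05-01"]),
        "yhat": [3.0, 2.5],
    }),
}

result = predictions_to_frame(example)
# One CSV for all articles, with the article code as the first column
result.to_csv("forecast_by_article.csv", index=False)
```

With real Prophet forecasts the frame would also carry columns such as `yhat_lower` and `yhat_upper`; selecting `result[["article", "ds", "yhat"]]` before writing keeps the file small.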

Vj-
  • I have executed the solution now, but since the total levels in the article barcode is around 2500 and the number of records is over 3M, it is taking a long time. Will check and update the answer. Thanks @Vj- – vishnu prashanth Mar 12 '18 at 17:44
  • @Vj there is a typo in line 3 `df = df.rename(columns={'Date of the document': 'ds','Quantity sold': 'y', 'Article bar code': 'article'})` should be: `df2 = df.rename(columns={'Date of the document': 'ds','Quantity sold': 'y', 'Article bar code': 'article'})` no? – Mysterio Apr 25 '19 at 06:47
  • Did you manage to convert the prediction dictionary to a dataframe? Do you know of a way to do this? – George C. Serban Aug 23 '21 at 12:10

I know this is old, but I faced a similar problem and this worked for me:

import pandas as pd
from prophet import Prophet  # older releases: from fbprophet import Prophet

df = pd.read_csv('file.csv')
df = df.rename(columns={'Date of the document': 'ds', 'Quantity sold': 'y', 'Article bar code': 'Article'})
# Drop articles with fewer than 3 records; Prophet raises an error when a group has fewer than 2 rows
df = df.groupby('Article').filter(lambda x: len(x) > 2)

df['Article'] = df['Article'].astype(str)

forecasts = []

# Group by the renamed 'Article' column and fit one model per article
grouped = df.groupby('Article')
for article, group in grouped:
    m = Prophet()
    m.fit(group)
    future = m.make_future_dataframe(periods=365)
    forecast = m.predict(future)
    # Add a column with the article bar code
    forecast['Article'] = article
    forecasts.append(forecast)

# Concatenate all results into one dataframe
final = pd.concat(forecasts, ignore_index=True)

final.head(10)
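To see what the group-size filter does before running it on real data, here is a small check on a made-up frame. The rows and the variable names `sales`, `kept`, and `kept_articles` are purely illustrative.

```python
import pandas as pd

# Made-up sales rows: article "A" has 3 records, article "B" has only 1
sales = pd.DataFrame({
    "Article": ["A", "A", "A", "B"],
    "ds": pd.to_datetime(["2018-01-01", "2018-02-01", "2018-03-01", "2018-01-01"]),
    "y": [5, 7, 6, 2],
})

# Keep only the articles that have more than 2 rows
kept = sales.groupby("Article").filter(lambda g: len(g) > 2)
kept_articles = sorted(kept["Article"].unique())
```

Only the rows for article "A" survive, so the single-record article "B" never reaches `m.fit`.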
Irene