
My problem is pretty simple, and I know I'm missing something very obvious. I just can't figure out what it is.

My test predictions for Holt-Winters are coming out as NaN and I can't figure out why. Can anyone help on this?

I'm using a Jupyter Notebook and trying to forecast sales of one SKU using the Holt-Winters method. I even went as far as using

Here is the code I used:

# Import the libraries needed to execute Holt-Winters

import pandas as pd
import numpy as np
%matplotlib inline

# Load the data, using the Month column as a DatetimeIndex
df = pd.read_csv('../Data/M1045_White.csv', index_col='Month', parse_dates=True)

# Declare the index frequency as month start
df.index.freq = 'MS'
df.index

df.head()

df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 48 entries, 2015-05-01 to 2019-04-01
Freq: MS
Data columns (total 7 columns):
Sales       48 non-null int64
EWMA12      48 non-null float64
SES12       47 non-null float64
DESadd12    47 non-null float64
DESmul12    47 non-null float64
TESadd12    48 non-null float64
TESmul12    12 non-null float64
dtypes: float64(6), int64(1)
memory usage: 3.0 KB

from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Train Test Split

train_data = df.iloc[:36] # Goes up to but not including 36
test_data = df.iloc[36:]  # The last 12 months, matching the forecast horizon

# Fit the Model

fitted_model = ExponentialSmoothing(train_data['Sales'], trend='mul', seasonal='mul', seasonal_periods=12).fit()

test_predictions = fitted_model.forecast(12).rename('HW M1045 White Forecast')

test_predictions

Here is the output of my predictions:

2018-05-01   NaN
2018-06-01   NaN
2018-07-01   NaN
2018-08-01   NaN
2018-09-01   NaN
2018-10-01   NaN
2018-11-01   NaN
2018-12-01   NaN
2019-01-01   NaN
2019-02-01   NaN
2019-03-01   NaN
2019-04-01   NaN
Freq: MS, Name: HW M1045 White Forecast, dtype: float64

Can someone please point out what I may have missed? This seems to be a simple problem with a simple solution, but it's kicking my butt.

Thanks!

Suhas_Pote
Quantum Prophet
  • Please take a moment to format the code properly. – OSainz May 28 '19 at 23:16
  • Were you able to resolve this issue? I am also getting the same error. – Kriti Pawar Aug 26 '22 at 17:32
  • Yes, ensure that there are no blank cells and that the data is an integer type. An N/A or similar placeholder can also cause this error. I did not do as good a job as I should have with data cleaning. – Quantum Prophet Aug 30 '22 at 19:59

3 Answers


The issue seems to be related to the seasonal_periods argument being set to 12. If it is changed to 6, the forecast returns actual values. I don't know Exponential Smoothing well enough to explain why this is the case.

finianoneill

Reason:

Your training data contained some NaNs, so the model could neither fit nor forecast.

Look at the non-null count for each column: it is not the same across columns.

df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 48 entries, 2015-05-01 to 2019-04-01
Freq: MS
Data columns (total 7 columns):
Sales       48 non-null int64
EWMA12      48 non-null float64
SES12       47 non-null float64
DESadd12    47 non-null float64
DESmul12    47 non-null float64
TESadd12    48 non-null float64
TESmul12    12 non-null float64
dtypes: float64(6), int64(1)
memory usage: 3.0 KB

Check whether there are any missing values in the dataframe:

df.isnull().sum()

Solution:

In your case, missing value treatment is needed before training the model.
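
As a rough sketch of what missing value treatment could look like: the series below is a made-up stand-in for the Sales column (the values and the two gaps are assumptions, not the asker's data). It shows two common options, interpolating the gaps or dropping them. Which one is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series standing in for df['Sales'], with two gaps
idx = pd.date_range("2015-05-01", periods=8, freq="MS")
sales = pd.Series(
    [120, 135, np.nan, 150, 160, np.nan, 170, 180],
    index=idx,
    name="Sales",
)

# Count the missing values before doing anything else
n_missing = sales.isnull().sum()
print("missing values:", n_missing)

# Option A: fill each gap by linear interpolation between its neighbours.
# This keeps the regular monthly index intact.
filled = sales.interpolate(method="linear")

# Option B: drop the missing rows. Note that this leaves holes in the
# monthly index, which a seasonal model may not handle well.
dropped = sales.dropna()

print(filled)
```

Interpolation is usually the safer choice for a seasonal model because the index stays evenly spaced, but that is a judgment call rather than a rule.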

Suhas_Pote

Thanks all. My bad: there were a few blank cells and N/A values in my dataset that caused this error. It was my mistake for not doing a better job with data cleaning. I also made sure my dates were formatted correctly and that the sales data was stored as integers.
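
For anyone hitting the same thing, here is a minimal sketch of the kind of cleaning described above. The small frame is invented for illustration (its values are assumptions, not my actual file); only the column names Month and Sales come from my data.

```python
import pandas as pd

# Hypothetical raw data as it might look after a messy spreadsheet export
raw = pd.DataFrame({
    "Month": ["2015-05-01", "2015-06-01", "2015-07-01", "2015-08-01"],
    "Sales": ["120", "N/A", "", "150"],
})

# Parse the dates explicitly so a malformed one fails loudly
raw["Month"] = pd.to_datetime(raw["Month"])

# Convert sales to numbers; anything unparseable ("N/A", blanks) becomes NaN
raw["Sales"] = pd.to_numeric(raw["Sales"], errors="coerce")

# List the rows that still need attention before fitting a model
bad_rows = raw[raw["Sales"].isnull()]
print(bad_rows)
```

Once the flagged rows are corrected or filled, the column can be cast back to integers with astype(int) if desired.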

Quantum Prophet