I am using statsmodels.tsa.arima_model.ARIMA
, and I took the square root transform of the endogenous variable before plugging it into the algorithm. The model uses a differencing order of 1:
model = ARIMA(sj_sqrt, order=(2, 1, 0))
After fitting the model and grabbing the predictions, I want to put the predictions back in the original form for comparison with the original data. However, I can't seem to transform them back correctly.
To replicate a simple version of this problem, here is some code:
#original data:
test = pd.Series([1,1,1,50,1,1,1,1,1,1,1,1,40,1,1,2,1,1,1,1,1])
#sqrt transformed data:
test_sqrt = np.sqrt(test)
#sqrt differenced data:
test_sqrt_diff = test_sqrt.diff(periods=1)
#undo differencing:
test_sqrt_2 = cumsum(test_sqrt_diff)
#undo transformations:
test_2 = test_sqrt_2 ** 2
f, axarr = plt.subplots(5, sharex=True, sharey=True)
axarr[0].set_title('original data:')
axarr[0].plot(test)
axarr[1].set_title('sqrt transformed data:')
axarr[1].plot(test_sqrt)
axarr[2].set_title('sqrt differenced data:')
axarr[2].plot(test_sqrt_diff)
axarr[3].set_title('differencing undone with .cumsum():')
axarr[3].plot(test_sqrt_2)
axarr[4].set_title('transformation undone by squaring:')
axarr[4].plot(test_2)
f.set_size_inches(5, 12)
You can see from the graphs that the undifferenced, untransformed data is not quite on the same scale. test[3]
returns 50
, and test_2[3]
returns 36.857864376269056