I am using statsmodels.tsa.arima_model.ARIMA, and I took the square root transform of the endogenous variable before plugging it into the algorithm. The model uses a differencing order of 1:

model = ARIMA(sj_sqrt, order=(2, 1, 0))

After fitting the model and grabbing the predictions, I want to put the predictions back in the original form for comparison with the original data. However, I can't seem to transform them back correctly.

To replicate a simple version of this problem, here is some code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

#original data:
test = pd.Series([1,1,1,50,1,1,1,1,1,1,1,1,40,1,1,2,1,1,1,1,1])

#sqrt transformed data:
test_sqrt = np.sqrt(test)

#sqrt differenced data:
test_sqrt_diff = test_sqrt.diff(periods=1)

#undo differencing:
test_sqrt_2 = test_sqrt_diff.cumsum()

#undo transformations:
test_2 = test_sqrt_2 ** 2

f, axarr = plt.subplots(5, sharex=True, sharey=True)
axarr[0].set_title('original data:')
axarr[0].plot(test)
axarr[1].set_title('sqrt transformed data:')
axarr[1].plot(test_sqrt)
axarr[2].set_title('sqrt differenced data:')
axarr[2].plot(test_sqrt_diff)
axarr[3].set_title('differencing undone with .cumsum():')
axarr[3].plot(test_sqrt_2)
axarr[4].set_title('transformation undone by squaring:')
axarr[4].plot(test_2)
f.set_size_inches(5, 12)

You can see from the graphs that the undifferenced, untransformed data is not quite on the same scale. test[3] returns 50, and test_2[3] returns 36.857864376269056.

Renel Chesak

1 Answer

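Solution: differencing discards the first value of the series, so a cumulative sum alone cannot restore the original level. Prepend the first value to the differences before summing: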

import numpy as np

## original
x = np.array([1,1,1,50,1,1,1,1,1,1,1,1,40,1,1,2,1,1,1,1,1])

## sqrt
x_sq = np.sqrt(x)

## diff
d_sq = np.diff(x_sq,n=1)

## Only works when d = 1
def diffinv(d,i):
    inv = np.insert(d,0,i)
    inv = np.cumsum(inv)
    return inv

## inv diff 
y_sq = diffinv(d_sq,x_sq[0])

## Check inv diff (np.allclose tolerates floating-point rounding)
np.allclose(y_sq, x_sq)
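
The snippet above stops at the square-root scale. As a minimal sketch of the full round trip back to the original scale, the following combines the inverse differencing with the squaring step. The helper name `undo_sqrt_diff` and its parameters are illustrative and not part of the original answer; like the code above, it assumes a differencing order of 1.

```python
import numpy as np

def undo_sqrt_diff(diffs, first_sqrt_value):
    """Invert a first-order difference of sqrt-transformed data.

    diffs: first differences on the sqrt scale.
    first_sqrt_value: the sqrt-scale value that precedes the first difference.
    (Hypothetical helper for illustration; only valid for d = 1.)
    """
    # Rebuild the sqrt-scale levels: starting value plus the running sum of changes.
    sqrt_levels = first_sqrt_value + np.concatenate(([0.0], np.cumsum(diffs)))
    # Undo the square-root transform.
    return sqrt_levels ** 2

x = np.array([1,1,1,50,1,1,1,1,1,1,1,1,40,1,1,2,1,1,1,1,1], dtype=float)
x_sq = np.sqrt(x)
d_sq = np.diff(x_sq, n=1)

restored = undo_sqrt_diff(d_sq, x_sq[0])

# Compare with a tolerance, since cumulative sums accumulate rounding error.
print(np.allclose(restored, x))
```

The same idea should carry over to predictions that are returned as differences: pass the last observed sqrt-scale value as `first_sqrt_value` and drop the first element of the result, which is just that starting observation squared.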
Renel Chesak