
I'm working with data that is not normally distributed. I have applied the common transformations (logs and square roots) so that I can then fit an ARIMA model and make a forecast.

What I have tried is:

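set.seed(123)
y <- rexp(200)  # simulated right-skewed data

# Log transform (shifted by 1 to avoid log(0))
yl <- log(y + 1)
shapiro.test(yl)

# Standardize to mean 0 and sd 1
trans <- (y - mean(y)) / sd(y)
shapiro.test(trans)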

These methods fail the normality test. Are there other options in R for transforming data so that it is normally distributed?

Rob Mensching
Michelle
  • This answer is an interesting read: http://stackoverflow.com/questions/7781798/seeing-if-data-is-normally-distributed-in-r/7788452#7788452 – Arun Apr 01 '13 at 16:20
  • Thanks, but that reading is related to normality tests. I can apply Jarque Bera test and the results will still have the same accordance. – Michelle Apr 01 '13 at 16:27
  • Yes, the point is on the validity of normality tests and their assumptions. That is, you're testing for a "good" normalisation based on tests whose hypothesis are about rejecting the assumption of normality. – Arun Apr 01 '13 at 16:29
  • As well as @Arun's comments, which are extremely relevant, you should be testing the **residuals** from your model, not the raw data. You are currently implementing a procedure that will produce statistical garbage. – hadley Apr 01 '13 at 21:50

1 Answer

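You can try the forecast package, which has the BoxCox.lambda function to choose the parameter of a Box-Cox transformation. When lambda is passed to auto.arima, the transformation and the back-transformation of the forecasts are done automatically. Example: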

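 library(forecast)
 # Simulated monthly series: seasonal sine wave plus normal and uniform noise
 y <- ts(rnorm(120, 0, 3) + 20 * sin(2 * pi * (1:120) / 12), frequency = 12) + runif(120)
 lambda <- BoxCox.lambda(y)  # estimate lambda; should check if the transformation is necessary
 model <- auto.arima(y, lambda = lambda)  # model is fitted on the Box-Cox scale
 plot(forecast(model))  # forecasts are returned on the original scale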
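
As the comments on the question point out, it is the residuals of the fitted model, not the raw data, that should look roughly normal. Here is a minimal sketch of that check. To keep it runnable without extra packages it uses base R's arima() with a hand-picked AR(1) order instead of auto.arima(), and a log transform (the Box-Cox case lambda = 0). The simulated series, the model order, and the lag of 12 are assumptions for illustration only.

```r
set.seed(123)

# Assumption: a simulated positive, right-skewed series stands in for real data
x <- arima.sim(list(ar = 0.5), n = 120, sd = 0.3)
y <- exp(2 + x)

# Fit on the log scale (Box-Cox with lambda = 0); AR(1) is chosen by hand here
fit <- arima(log(y), order = c(1, 0, 0))

# Diagnose the residuals of the model, not the raw data
res <- residuals(fit)
sw <- shapiro.test(res)                                       # normality of residuals
lb <- Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 1)  # leftover autocorrelation

print(sw$p.value)
print(lb$p.value)
```

Large p-values in both tests would be consistent with roughly normal, uncorrelated residuals. A Q-Q plot of res (qqnorm(res); qqline(res)) is often more informative than either test alone.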
Fernando