
I am using R to perform my analyses.

I have five datasets covering the same year: 365 daily rainfall measurements, and four sets of 365 water-height values from runoff drains located close to each other.

Plots of the drain data show very similar patterns. Most of the differences between them are due to randomly distributed missing values arising from equipment failure.

The rainfall data seem broadly similar to the drain data. Where there are spikes in rainfall there are spikes in the drain data. I do not know as yet if there are any lags between rainfall and drain data.

[Figure: parallel plot of precipitation and drain data]

I have two questions that involve determining the similarity between the time series of aperiodic data.

  1. By which criteria may I justifiably merge the drain datasets to "fill in" the missing data?

  2. How may I compare the rainfall data to the drain data so that I may say to what extent the one caused the other?
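For the first question, one possible starting point (a minimal sketch, not a definitive criterion) is to check how strongly the drains agree on the days where readings overlap, and only then fill gaps from the other drains. Everything below is simulated: the `drains` matrix, the shared `signal`, and the choice of a same-day mean as the fill value are all assumptions for illustration. If the real drains sit at different baseline levels, a plain mean would be biased and some rescaling would be needed first.

```r
set.seed(7)
n <- 365

# Simulated stand-in for four neighbouring drains sharing one pattern
signal <- cumsum(rnorm(n))
drains <- sapply(1:4, function(i) signal + rnorm(n, sd = 0.2))
colnames(drains) <- paste0("drain", 1:4)

# Knock out 100 readings at random to mimic equipment failure
drains[sample(length(drains), 100)] <- NA

# Pairwise correlations, each using only the days where both drains have readings
cors <- cor(drains, use = "pairwise.complete.obs")

# Fill each gap with the mean of the drains that did record on that day
day_mean <- rowMeans(drains, na.rm = TRUE)
gaps <- is.na(drains)
filled <- drains
filled[gaps] <- day_mean[row(drains)[gaps]]
```

Consistently high values in `cors` would be one argument that the drains are measuring the same process; a day on which every drain failed stays missing in `filled`.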

I have briefly experimented with using arima() in R but I don't understand how to use the output.
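As a rough illustration of what the `arima()` output contains, here is a minimal sketch on a simulated AR(1) series. The series `x`, the model order, and the lag used in the residual check are assumptions chosen for the example, not a recommendation for the drain data.

```r
set.seed(123)

# Simulated AR(1) series standing in for real measurements
x <- arima.sim(model = list(ar = 0.6), n = 365)

# Fit an ARIMA(1, 0, 0) model
fit <- arima(x, order = c(1, 0, 0))

est <- coef(fit)             # coefficient estimates ("ar1" and "intercept")
se  <- sqrt(diag(vcov(fit))) # approximate standard errors of those estimates
res <- residuals(fit)        # what the model leaves unexplained

# Ljung-Box test: are the residuals free of leftover autocorrelation?
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)

print(fit)
print(lb)
```

An estimate that is large relative to its standard error suggests that term matters, and a non-significant Ljung-Box result suggests the residuals look like noise.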

I have a suspicion I should be using the grangertest() function, but I don't know how to implement it on two or more datasets.
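To make the idea behind a Granger test concrete, here is a minimal base-R sketch on simulated data: it compares a model of the drain series on its own past values with a model that also includes past rainfall. The series, the one-day response, and the lag order `p` are assumptions for illustration. The `lmtest` package wraps this kind of nested-model comparison in `grangertest()`, for example `grangertest(drain ~ rain, order = 2, data = d)` for a data frame `d`, and with several drains the test would be run once per rainfall/drain pair. Note that it tests whether past rainfall improves prediction, which is weaker than showing causation.

```r
set.seed(42)
n <- 365

# Simulated data: the drain responds to rainfall one day later
rain  <- rgamma(n, shape = 0.3, scale = 10)
drain <- 0.5 * c(0, head(rain, -1)) + rnorm(n, sd = 0.5)

p <- 2  # number of lags to include (an assumption)

# embed() returns a matrix whose columns are the series at lags 0, 1, ..., p
D <- embed(drain, p + 1)
R <- embed(rain, p + 1)
y          <- D[, 1]
drain_lags <- D[, -1, drop = FALSE]
rain_lags  <- R[, -1, drop = FALSE]

# Restricted model: drain explained by its own past only
restricted <- lm(y ~ drain_lags)
# Full model: past rainfall added as extra predictors
full <- lm(y ~ drain_lags + rain_lags)

# F-test: does past rainfall add predictive power?
gt <- anova(restricted, full)
p_value <- gt[["Pr(>F)"]][2]
print(gt)
```

This sketch assumes complete series; with missing values, both models would need to be fitted on the same set of complete rows.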

Any assistance or advice would be sincerely appreciated.

Sinval
Peter Wade
  • Welcome to the site! However, I think your question is better suited to https://stats.stackexchange.com/questions, since you're asking about statistical theory rather than something specific to coding. Maybe split your question in two: the pure coding part goes here, the theoretical part goes to stackexchange. The `arima()` command simply fits an ARIMA model that you specify to the data and returns a model object with information about the fit, such as coefficient estimates and standard errors, similar to an `lm()` output. – PaulG Mar 20 '21 at 18:41
  • I think you would want to compute cross-correlation functions, although doing that in the presence of NA values might require some slightly fancier code than the built-in `ccf()` function ... – Ben Bolker Mar 20 '21 at 20:12
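A minimal sketch of the cross-correlation idea on simulated data follows. The series, the two-day delay, and the number of gaps are assumptions for illustration. One option for the missing values is `na.action = na.pass`, which the `acf()` documentation describes as computing the covariances from the complete cases.

```r
set.seed(1)
n <- 365

# Simulated data: the drain responds to rainfall two days later
rain  <- rgamma(n, shape = 0.3, scale = 10)
drain <- 0.5 * c(0, 0, head(rain, -2)) + rnorm(n, sd = 0.5)

# Random gaps in the drain record
drain[sample(n, 30)] <- NA

# ccf(x, y) at lag k estimates cor(x[t + k], y[t]), so with the drain first
# a positive lag means the drain follows the rainfall
cc <- ccf(drain, rain, lag.max = 10, na.action = na.pass, plot = FALSE)

# Lag (in days) with the strongest correlation
best_lag <- cc$lag[which.max(cc$acf)]
print(best_lag)
```

Plotting `cc` (or calling `ccf()` with `plot = TRUE`) shows the whole lag profile rather than just the peak.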
  • Thank you Sinval, Ben and Paul. I have reposted my question in the stats.stackexchange.com/questions site, without the reference to the ARIMA model, which I believe involves detection of autocorrelation, which is explicitly absent in my stochastic dataset. – Peter Wade Mar 21 '21 at 09:38

0 Answers