-1

I have an excel dataset of 24-hour moving averages for PM10 air pollution concentration levels, and need to obtain the individual hourly readings from them. The moving average data is updated every hour, so at hour t, the reading is the average of the 24 readings from t-23 to t hours, and at hour t+1, the reading is the average of t-22 to t+1, etc. I do not have any known data points to extrapolate from, just the 24-hour moving averages.

Is there any way I can obtain the individual hourly readings for time t, t+1, etc, from the moving average?

The dataset contains data over 3 years, so with 24 readings a day (at every hour), the dataset has thousands of readings.

I have tried searching for a possible way to implement a simple excel VBA code to do this, but come up empty. Most of the posts I have seen on Stackoverflow and stackexchange, or other forums, involve calculating moving averages from discrete data, which is the reverse of what I need to do here.

The few I have seen involve using matrices, which I am not very sure how to implement. (https://stats.stackexchange.com/questions/67907/extract-data-points-from-moving-average) (https://stats.stackexchange.com/questions/112502/estimating-original-series-from-their-moving-average)

Any suggestions would be greatly appreciated!

Pᴇʜ
  • 56,719
  • 10
  • 49
  • 73
Andrew
  • 1
  • 1

1 Answers1

0

Short answer: you can't.

Consider a moving average on 3 points. And even consider we multiply each MA term by 3, so we really have sums of consecutive

Data: a b c d e f g
MA    a+b+c
        b+c+d
          c+d+e
            d+e+f
              e+f+g

With initial values, you can do something. To find the value of d, you would need to know b+c, hance to know a (since a+b+c is known). Then to find e, you know c+d+e and d, so you must find c, and since a is already needed, you will need also need b.

More generally, for a MA of length n, if you know the first n-1 values (hence also the nth, since you know the sum), then you can find all subsequent values. You can also start from the end. But basically, if you don't have enough original data, you are lost: there is a 1-1 relation between the n-1 first values of your data and the possible MA series. If you don't have enough information, there are infinitely many possibilities, and you can't decide which one is right.

Here I consider the simplest MA where the coefficient of each variable is 1/n (hence you compute the sum and divide by n). But this would apply to any MA, with slightly more complexity to account for different coefficients for each term in the sum.