I'm working with a nested list (ground sensors) of lists (measurement depths) of lists (data frames for each year from 2014-2018) and I want to perform linear interpolation for each of those data frames. Here is an overview of the dataset, just so you can get an idea what it looks like:
str(G1OUT_gwFERN)
$ SE13 :List of 3
..$ d20:List of 5
.. ..$ 2014:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
.. .. ..$ SWC : num [1:8760] 46 45.9 46 45.9 45.9 ...
.. ..$ 2015:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
.. .. ..$ SWC : num [1:8760] 49.8 49.8 49.8 49.8 49.8 ...
.. ..$ 2016:'data.frame': 8784 obs. of 2 variables:
.. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
.. .. ..$ SWC : num [1:8784] 48.2 48.2 48.1 48.1 48.1 ...
.. ..$ 2017:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
.. ..$ 2018:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
..$ d50:List of 5
.. ..$ 2014:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
.. .. ..$ SWC : num [1:8760] 35.2 35.2 35.2 35.2 35.2 ...
.. ..$ 2015:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
.. .. ..$ SWC : num [1:8760] 34.8 34.8 34.7 34.7 34.8 ...
.. ..$ 2016:'data.frame': 8784 obs. of 2 variables:
.. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
.. .. ..$ SWC : num [1:8784] 34.2 34.2 34.1 34.1 34.1 ...
.. ..$ 2017:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
.. ..$ 2018:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
.. .. ..$ SWC : num [1:8760] 36.4 36.4 36.3 36.3 36.3 ...
..$ d5 :List of 5
.. ..$ 2014:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
.. .. ..$ SWC : num [1:8760] 32.5 32.4 32.4 32.4 32.4 ...
.. ..$ 2015:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
.. .. ..$ SWC : num [1:8760] 32.1 32.1 32.1 32.1 32.1 ...
.. ..$ 2016:'data.frame': 8784 obs. of 2 variables:
.. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
.. .. ..$ SWC : num [1:8784] 30.3 30.3 30.3 30.2 30.2 ...
.. ..$ 2017:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
.. ..$ 2018:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
.. .. ..$ SWC : num [1:8760] 31.1 31.2 31.1 31.1 31.1 ...
$ SE14 :List of 3
..$ d20:List of 5
.. ..$ 2014:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
.. .. ..$ SWC : num [1:8760] 52.5 52.5 52.5 52.5 52.4 ...
.. ..$ 2015:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
.. .. ..$ SWC : num [1:8760] 53.7 53.7 53.7 53.7 53.7 ...
.. ..$ 2016:'data.frame': 8784 obs. of 2 variables:
.. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
.. .. ..$ SWC : num [1:8784] 52.3 52.2 52.3 52.3 52.2 ...
.. ..$ 2017:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
.. ..$ 2018:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
.. .. ..$ SWC : num [1:8760] 55 55 55 55.1 55 ...
..$ d50:List of 5
.. ..$ 2014:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
.. .. ..$ SWC : num [1:8760] 27.9 27.9 27.9 27.9 27.9 ...
.. ..$ 2015:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
.. .. ..$ SWC : num [1:8760] 28.5 28.5 28.5 28.5 28.5 ...
.. ..$ 2016:'data.frame': 8784 obs. of 2 variables:
.. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
.. .. ..$ SWC : num [1:8784] 26.7 26.7 26.7 26.6 26.7 ...
.. ..$ 2017:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
.. ..$ 2018:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
.. .. ..$ SWC : num [1:8760] 29.4 29.4 29.4 29.4 29.5 ...
..$ d5 :List of 5
.. ..$ 2014:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
.. .. ..$ SWC : num [1:8760] 39.8 39.8 39.7 39.6 39.7 ...
.. ..$ 2015:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
.. .. ..$ SWC : num [1:8760] 42.2 42.3 42.3 42.3 42.3 ...
.. ..$ 2016:'data.frame': 8784 obs. of 2 variables:
.. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
.. .. ..$ SWC : num [1:8784] 36.6 36.6 36.5 36.6 36.5 ...
.. ..$ 2017:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
.. .. ..$ SWC : num [1:8760] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
.. ..$ 2018:'data.frame': 8760 obs. of 2 variables:
.. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
.. .. ..$ SWC : num [1:8760] 56.5 56.5 56.5 56.5 56.3 ...
I extracted a part of one of the dataframes from the list and used dput()
so you have some toy data to work with:
toydat <- structure(list(Date = structure(c(16277, 16277, 16277, 16277,
16277, 16277, 16277, 16277, 16277, 16277, 16277, 16277, 16277,
16277, 16277, 16277, 16277, 16277, 16277, 16278, 16278, 16278,
16278, 16278, 16278, 16278, 16278, 16278, 16278, 16278, 16278,
16278, 16278, 16278, 16278, 16278, 16278, 16278, 16278, 16278,
16278, 16278, 16278, 16279, 16279, 16279, 16279, 16279, 16279,
16279, 16279, 16279, 16279, 16279, 16279, 16279, 16279, 16279,
16279, 16279, 16279, 16279, 16279, 16279, 16279, 16279, 16279,
16280, 16280, 16280, 16280, 16280, 16280, 16280, 16280, 16280,
16280, 16280, 16280, 16280, 16280, 16280, 16280, 16280, 16280,
16280, 16280, 16280, 16280, 16280, 16280, 16281, 16281, 16281,
16281, 16281, 16281, 16281, 16281, 16281, 16281), class = "Date"),
SWC = c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 19.627243, 19.543659,
19.593796, 19.534379, 19.59937, 19.51582, 19.482441, 19.51582,
19.571497, 19.645825, 20.83435, 21.116572, 22.688702, 22.216629,
21.54243, 21.229946, 21.003335, 20.833735, 20.74902, 20.608045,
20.512311, 20.411049)), row.names = 48774:48874, class = "data.frame")
The measurements were taken hourly, so I have 24 measurements for one day. Some values in the dataframes are NoData values though, so I want to fill these gaps using linear interpolation. However, I only want to use linear interpolation if the gap between the NoData values and actual values is not bigger than 2 days. In regard to the toy data this means that if the values are missing for the 28th and 29th of July (2014-07-28 and 2014-07-29) I only want to fill the gaps for these days and not for the 27th, 26th, 25th... July and so on. If the gaps are larger than 2 days I want to keep the NoData values as I'll use linear regression later on to fill these gaps but that should not be topic of this post.
I already tried the following things:
I used the na.approx()
function from the zoo
package. I typed:
na.approx(toydat$SWC, na.rm = FALSE)
but that just returns the data as it was before and does not interpolate (I typed $SWC because I only want to interpolate that column). I figured that if I add rule = 2
to the code that it takes the last value after the NaN values and just puts that value for all of the NaN values which is not what I want. I also tried using maxgap = 48
because I thought that would make sure that only 48 values get interpolated. However, as I couldn't manage to interpolate properly anyway, nothing happened.
I would really appreciate some help.