R Fill NA with last value for a max of n times

Question

There are multiple ways to fill missing values in R. However, I can't find a solution for filling just the last n NAs.

Available options:

na_vector <- c(1, NA, NA, NA, 2, 3, NA, NA)

library(zoo)

na.locf(na_vector)
# Outputs: [1] 1 1 1 1 2 3 3 3

na.locf0(na_vector, maxgap = 2)
# Outputs: [1] 1 NA NA NA  2  3  3  3

How I would like it to be:

na_vector <- c(1, NA, NA, NA, 2, 3, NA, NA)

fill_na <- function(vector, n){
   ...
}

fill_na(na_vector, n = 1)
# Outputs: [1] 1 1 NA NA  2  3  3  NA

fill_na(na_vector, n = 2)
# Outputs: [1] 1 1 1 NA  2  3  3  3

Your examples don’t match your description of “filling the last n NAs”. The last value for `fill_na(na_vector, n = 1)` is still `NA`. Can you clarify? — zephryl, Nov 10 '22 at 23:22
It looks like you actually want to replace the *first* n within each run of `NA`s. Is that right? — zephryl, Nov 10 '22 at 23:24

Santiago · Accepted Answer · 2022-11-10T23:51:55.120

1

Here is an option to get those outputs using dplyr and recursion:

na_vector <- c(1, NA, NA, NA, 2, 3, NA, NA)

fill_na <- function(vector, n){
  if (n == 0) {
    vector
  } else {
    fill_na(
      vector = dplyr::coalesce(vector, dplyr::lag(vector)),
      n = n - 1
    )
  }
}

fill_na(na_vector, n = 1)
# [1]  1  1 NA NA  2  3  3 NA

fill_na(na_vector, n = 2)
# [1]  1  1  1 NA  2  3  3  3
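
For readers who want to avoid the dplyr dependency, the same idea can be sketched as a plain loop in base R. Each pass replaces an NA with the value immediately before it, so n passes fill at most n NAs per run. The name fill_na_loop is hypothetical, and the sketch assumes n is a non-negative whole number and the vector is non-empty.

```r
na_vector <- c(1, NA, NA, NA, 2, 3, NA, NA)

# Hypothetical base R equivalent of the recursive version above
fill_na_loop <- function(vector, n) {
  for (i in seq_len(n)) {
    # Shift the vector one position to the right (what dplyr::lag does)
    lagged <- c(NA, head(vector, -1))
    # Keep existing values, take the lagged value where the entry is NA
    vector <- ifelse(is.na(vector), lagged, vector)
  }
  vector
}

fill_na_loop(na_vector, n = 1)
# [1]  1  1 NA NA  2  3  3 NA

fill_na_loop(na_vector, n = 2)
# [1]  1  1  1 NA  2  3  3  3
```

Both versions do n passes over the whole vector, so for large n on long vectors a single-pass approach may be preferable.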

edited Nov 10 '22 at 23:51

answered Nov 10 '22 at 23:37


G. Grothendieck · Answer 2 · 2022-11-11T13:17:52.713

Number the NAs within each consecutive run of NAs, giving `a`, and then fill in only those whose number is less than or equal to n. This uses only vector operations internally, with no iteration or recursion.

library(collapse)
library(zoo)

fill_na <- function(x, n) {
  a <- ave(x, groupid(is.na(x)), FUN = seq_along)
  ifelse(a <= n, na.locf0(x), x)
}

fill_na(na_vector, 1)
## [1]  1  1 NA NA  2  3  3 NA
fill_na(na_vector, 2)
## [1]  1  1  1 NA  2  3  3  3
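
If adding collapse and zoo is not an option, the run-numbering idea can also be sketched in base R alone. This is only an illustration of the same technique, not the answer's code: fill_na_base, pos_in_run and last_obs are hypothetical names, rle() plus sequence() stands in for groupid() plus ave(), and cummax() over the non-NA indices stands in for na.locf0().

```r
na_vector <- c(1, NA, NA, NA, 2, 3, NA, NA)

# Hypothetical base R sketch of the run-numbering approach
fill_na_base <- function(x, n) {
  na <- is.na(x)
  # Position of each element within its run of NA / non-NA values
  pos_in_run <- sequence(rle(na)$lengths)
  # Index of the most recent non-NA value at or before each position (0 if none)
  last_obs <- cummax(ifelse(na, 0L, seq_along(x)))
  # Fill only NAs that are among the first n of their run and have a predecessor
  fill <- na & pos_in_run <= n & last_obs > 0
  x[fill] <- x[last_obs[fill]]
  x
}

fill_na_base(na_vector, 1)
## [1]  1  1 NA NA  2  3  3 NA
fill_na_base(na_vector, 2)
## [1]  1  1  1 NA  2  3  3  3
```

Leading NAs have no previous value to carry forward, so this sketch leaves them as NA.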

score 0 · Answer 3 · answered Dec 17 '22 at 07:48

Here is a solution to impute everything except the last n NA's based on base R + imputeTS.

library(imputeTS)
na_vector <- c(1, NA, NA, NA, 2, 3, NA, NA)

# The function that allows imputing everything except the last n NAs
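fill_except_last_n_na <- function(x, n) {
  # For each position, count the NAs from there to the end of the vector
  nas_to_end <- rev(cumsum(rev(as.numeric(is.na(x)))))
  index <- which(nas_to_end == n+1)
  # With n or fewer NAs in total there is nothing to impute
  if (length(index) == 0) return(x)
  # Impute everything up to the last NA that is not among the final n
  end <- tail(index,1)
  x[1:end] <- na_locf(x[1:end])
  return(x)
}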

# Call the new function
fill_except_last_n_na(na_vector,2)

## [1]  1  1  1  1  2  3 NA NA

To use an imputation method other than last observation carried forward, replace na_locf with na_ma (moving average), na_interpolation (interpolation), na_kalman (Kalman smoothing on state space models), or another imputation function provided by the imputeTS package (see the imputeTS documentation for a list of functions).
