modify_at to remove NA values in each element in a list

Question

I have a big list of small datasets like this:

> my_list

[[1]]
# A tibble: 6 x 2
   Year FIPS 
  <dbl> <chr>
1  2015 12001
2  2015 51013
3  2015 12081
4  2015 12115
5  2015 12127
6  2015 42003

[[2]]
# A tibble: 9 x 2
   Year FIPS 
  <dbl> <chr>
1  2017 04013
2  2017 10003
3  2017 NA   
4  2017 25005
5  2017 25009
6  2017 25013
7  2017 25017
8  2017 25021
9  2017 25027

...

I want to remove the NAs from each tibble using modify_at because it looks like a clean way to do it. This is my attempt:

my_list %>% modify_at(c("FIPS"), drop_na)

I also tried na.omit, but I get the same error in both cases:

Error: character indexing requires a named object

Can anyone help me here, please? What am I doing wrong?

You're applying `drop_na` to the list itself, not the elements of the list. Try `lapply(my_list, drop_na)`. — Limey, Jul 07 '21 at 14:45
@Limey it works so smooth... thanks! So, there is no option to use `modify_at` in this case? — Ariel, Jul 07 '21 at 14:50
I think `modify_at()` would only be useful here combined with something like `modify_depth()` so you work within each data.frame instead of at the whole list level. That seems complicated compared to a `lapply()` or `map()` loop. :) — aosmith, Jul 07 '21 at 14:53
@Limey is right, you apply `drop_na` on the whole data set since it removes an entire row containing `NA` values. So `modify_at` is of no use here. But you can use `mylist %>% modify(drop_na)` instead. — Anoushiravan R, Jul 07 '21 at 15:00

score 0 · Accepted Answer · answered Jul 07 '21 at 15:36

Creating some data (column d is sampled at random, so the number of NA rows will vary between runs):

library(tidyverse)
mylist <-
  list(tibble(a = c(1, 2, NA),
              b = c(2, 2, 2)),
       tibble(c = rep(1, 5),
              d = sample(c(NA, 2), 5, replace = TRUE)))

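The .at argument in purrr::modify_at() selects which list elements to modify (by name or position), not a column within the data frames nested in the list. Your list is unnamed, so indexing it with "FIPS" raises the "character indexing requires a named object" error. purrr::modify() works for your purposes.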

modify(mylist, drop_na)
#> [[1]]
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 
#> [[2]]
#> # A tibble: 4 x 2
#>       c     d
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     1     2
#> 4     1     2
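To make the role of .at concrete, here is a minimal sketch using a small deterministic list (the `toy` data and the names `first` and `second` are made up for illustration, not taken from the question). It suggests that modify_at() picks whole list elements, so drop_na() is applied only to the tibbles that .at selects:

```r
library(purrr)
library(tidyr)
library(tibble)

# Hypothetical toy data: two tibbles, each containing one NA
toy <- list(
  first  = tibble(a = c(1, 2, NA), b = c(2, 2, 2)),
  second = tibble(c = c(1, 1, 1), d = c(NA, 2, 2))
)

# .at = 1 selects the first list element by position;
# drop_na() runs on that tibble only
by_position <- modify_at(toy, 1, drop_na)

# Because this list is named, a character .at also works here
by_name <- modify_at(toy, "second", drop_na)

nrow(by_position[[1]])  # first tibble cleaned
nrow(by_position[[2]])  # second tibble left untouched
nrow(by_name$second)    # second tibble cleaned
```

With an unnamed list such as the one in the question, only positional .at values can match, which is consistent with the error message above.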

purrr::map() also works. Since your input and output are both plain lists, map() is sufficient here; modify() is preferable when the input is of a class other than a regular list and you want to preserve that class in the output.

map(mylist, drop_na)
#> [[1]]
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 
#> [[2]]
#> # A tibble: 4 x 2
#>       c     d
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     1     2
#> 4     1     2
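If the goal is to drop rows only where one particular column is NA (FIPS in the question), one possible approach is to pass that column to drop_na() inside the mapped function. This is a sketch on made-up data shaped like the question's tibbles; `fips_list` and its values are assumptions for illustration:

```r
library(purrr)
library(tidyr)
library(tibble)

# Hypothetical data with the same columns as in the question
fips_list <- list(
  tibble(Year = c(2015, 2015), FIPS = c("12001", "51013")),
  tibble(Year = c(2017, NA, 2017), FIPS = c("04013", "10003", NA))
)

# drop_na(df, FIPS) removes rows where FIPS is NA and ignores NAs elsewhere
cleaned <- map(fips_list, function(df) drop_na(df, FIPS))

cleaned[[2]]  # the row with a missing Year but a valid FIPS is kept
```

Calling drop_na() without column arguments, as in the examples above, would instead remove any row containing an NA in any column.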

Base R:

lapply(mylist, na.omit)
#> [[1]]
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 
#> [[2]]
#> # A tibble: 4 x 2
#>       c     d
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     1     2
#> 4     1     2
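For a base R version that targets a single column rather than every column, a rough sketch could filter each data frame on is.na(). The `df_list` data below is hypothetical and uses plain data frames so that no packages are needed:

```r
# Hypothetical data with the same columns as in the question
df_list <- list(
  data.frame(Year = c(2015, 2015), FIPS = c("12001", "51013")),
  data.frame(Year = c(2017, NA, 2017), FIPS = c("04013", "10003", NA))
)

# Keep only rows where FIPS is not NA; drop = FALSE keeps the result a data frame
cleaned_base <- lapply(df_list, function(df) {
  df[!is.na(df$FIPS), , drop = FALSE]
})

cleaned_base[[2]]  # the row with a missing Year but a valid FIPS is kept
```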
