I have a big list of small datasets like this:

> my_list

[[1]]
# A tibble: 6 x 2
   Year FIPS 
  <dbl> <chr>
1  2015 12001
2  2015 51013
3  2015 12081
4  2015 12115
5  2015 12127
6  2015 42003

[[2]]
# A tibble: 9 x 2
   Year FIPS 
  <dbl> <chr>
1  2017 04013
2  2017 10003
3  2017 NA   
4  2017 25005
5  2017 25009
6  2017 25013
7  2017 25017
8  2017 25021
9  2017 25027

...

I want to remove the NAs from each tibble using modify_at because it looks like a clean way to do it. This is my attempt:

my_list %>% modify_at(c("FIPS"), drop_na)

I also tried na.omit, but I get the same error in both cases:

Error: character indexing requires a named object

Can anyone help me here, please? What am I doing wrong?

Ariel
  • Can you please share your data using `dput` function – Karthik S Jul 07 '21 at 14:43
  • You're applying `drop_na` to the list itself, not the elements of the list. Try `lapply(my_list, drop_na)`. – Limey Jul 07 '21 at 14:45
  • @Limey it works so smooth... thanks! So, there is no option to use `modify_at` in this case? – Ariel Jul 07 '21 at 14:50
  • I think `modify_at()` would only be useful here combined with something like `modify_depth()` so you work within each data.frame instead of at the whole list level. That seems complicated compared to a `lapply()` or `map()` loop. :) – aosmith Jul 07 '21 at 14:53
  • @Limey is right, you apply `drop_na` on the whole data set since it removes an entire row containing `NA` values. So `modify_at` is of no use here. But you can use `mylist %>% modify(drop_na)` instead. – Anoushiravan R Jul 07 '21 at 15:00

1 Answer

Creating some data.

library(tidyverse)
# Column d is sampled at random, so the number of NAs varies between runs.
mylist <-
  list(tibble(a = c(1, 2, NA),
              b = c(2, 2, 2)),
       tibble(c = rep(1, 5),
              d = sample(c(NA, 2), 5, replace = TRUE)))

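The .at argument in purrr::modify_at() selects which elements of the list to modify (by name or position), not columns within the data frames nested in the list. Your list is unnamed, so the character index "FIPS" fails with the error you saw. purrr::modify() works for your purposes.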

modify(mylist, drop_na)
#> [[1]]
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 
#> [[2]]
#> # A tibble: 4 x 2
#>       c     d
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     1     2
#> 4     1     2
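
To make the role of .at concrete, here is a minimal sketch. The data and the element names `y2015` and `y2017` are made up for illustration and are not from the question. It shows .at selecting list elements by position, and by name once the list has names.

```r
library(purrr)
library(tibble)
library(tidyr)

# Hypothetical data, loosely mirroring the structure in the question.
unnamed <- list(
  tibble(Year = 2015, FIPS = c("12001", NA)),
  tibble(Year = 2017, FIPS = c("04013", NA))
)

# .at picks list elements by position: only the first tibble is cleaned.
by_position <- modify_at(unnamed, 1, drop_na)

# With a named list, .at can also be a character vector of element names.
named <- set_names(unnamed, c("y2015", "y2017"))
by_name <- modify_at(named, "y2017", drop_na)

# Row counts per element show which tibbles were touched.
map_int(by_position, nrow)
map_int(by_name, nrow)
```

In both calls the elements not selected by .at keep their NA rows, which is why modify_at() is not a natural fit when every tibble needs cleaning.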

purrr::map() also works. Since your input is a plain list, map() is sufficient here; modify() is preferable when the input has some other class (for example, a data frame) and you want the output to keep that class.

map(mylist, drop_na)
#> [[1]]
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 
#> [[2]]
#> # A tibble: 4 x 2
#>       c     d
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     1     2
#> 4     1     2
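
The difference between map() and modify() only shows up when the input is not a plain list. As a rough illustration, assuming a made-up tibble `df` and a hypothetical helper `fill_zero()`, iterating over the columns of a tibble gives a list with map() but a tibble with modify():

```r
library(purrr)
library(tibble)

# Hypothetical tibble; map() and modify() iterate over its columns.
df <- tibble(a = c(1, NA, 3), b = c(NA, 2, 3))

# Hypothetical helper: replace NAs in a numeric vector with 0.
fill_zero <- function(x) ifelse(is.na(x), 0, x)

mapped   <- map(df, fill_zero)     # a plain named list of vectors
modified <- modify(df, fill_zero)  # same columns, still a tibble

is_tibble(mapped)
is_tibble(modified)
```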

The same in base R:

lapply(mylist, na.omit)
#> [[1]]
#> # A tibble: 2 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 
#> [[2]]
#> # A tibble: 4 x 2
#>       c     d
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     2
#> 3     1     2
#> 4     1     2
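
Note that drop_na() and na.omit() without further arguments drop rows with an NA in any column. If the goal is to look only at FIPS, as the original modify_at() attempt suggests, one possible sketch is to pass the column to drop_na() inside the loop. The data below is invented for illustration.

```r
library(purrr)
library(tibble)
library(tidyr)

# Hypothetical data: NAs appear in both columns, in different rows.
my_list <- list(
  tibble(Year = c(2015, NA, 2015), FIPS = c("12001", "51013", NA)),
  tibble(Year = c(2017, 2017),     FIPS = c(NA, "25005"))
)

# Only rows with NA in FIPS are dropped; the NA in Year is kept.
fips_only <- map(my_list, ~ drop_na(.x, FIPS))

# A base R version of the same filter.
fips_only_base <- lapply(my_list, function(df) df[!is.na(df$FIPS), ])

map_int(fips_only, nrow)
```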
Till