Get the name of a list item created with purrr::map

Question

I read a set of CSV files with purrr::map and got a large list of tibbles.

  csv_files <- list.files(path = data_path, pattern = '\\.csv$', full.names = TRUE)
  all_csv <- purrr::map(csv_files, readr::read_csv2)
  names(all_csv) <- gsub(data_path, "", csv_files)
  return(all_csv)

EDITED as suggested by @Spacedman

I further need to process each tibble/data frame separately within the process_csv_data function.

purrr::map(all_csv, process_csv_data)

How can I retrieve the name of the list item currently being processed, from inside process_csv_data, without using a for loop?

Also, use `basename(csv_files)` to get the file name part of the path. `gsub` fails if `data_path` is `"."`, which it was when I tried this. — Spacedman, Oct 24 '17 at 12:06
@Spacedman Is it the reason for the downvote? As I said, I'm avoiding a for loop and therefore I shouldn't have an index to use the bracket operator [. — Yann, Oct 24 '17 at 12:07
I think you should say *within the process_csv_data function* for clarity. — Spacedman, Oct 24 '17 at 13:17

score 21 · Accepted Answer · answered Oct 24 '17 at 13:16

Use map2, as in this reproducible example:

> library(purrr)
> L = list(a=1:10, b=1:5, c=1:6)
> map2(L, names(L), function(x,y){message("x is ",x," y is ",y)})
x is 12345678910 y is a
x is 12345 y is b
x is 123456 y is c

The value of x looks munged because message() pastes the vector's elements together, but x is the list element of L and y is its name.
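Applied to the question, the same idea might look like the sketch below. It assumes process_csv_data can be given a second argument for the name; the function body and the toy data frames here are hypothetical stand-ins, not the asker's actual code or files.

```r
library(purrr)

# Hypothetical stand-in for process_csv_data: it receives the data and
# the name of the list element the data came from.
process_csv_data <- function(data, name) {
  data.frame(source = name, n_rows = nrow(data))
}

# Toy named list standing in for all_csv (assumed data, no files read).
all_csv <- list(
  "a.csv" = data.frame(x = 1:3),
  "b.csv" = data.frame(x = 1:5)
)

# map2() walks the list and its names in parallel.
results <- map2(all_csv, names(all_csv), process_csv_data)

# imap() is shorthand for the same call: for a named list it passes
# each element and its name to the function.
results_imap <- imap(all_csv, process_csv_data)
```

Both calls keep the names of all_csv on the result, so results[["b.csv"]] is the processed output for that file.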

Spacedman

`imap` was devised to make these usages of `map2` sexier, your answer can be simplified into : `imap(L,~message("x is ",.x," y is ",.y))` – moodymudskipper Oct 27 '17 at 09:16
see also `lmap`, that allows you to loop on `list-elements` (sublists of length 1) : `lmap(L,~ {message("x is ",.x[[1]]," y is ",names(.x));return(list(NULL))})` – moodymudskipper Oct 27 '17 at 09:22

score 6 · Answer 2 · answered Oct 24 '17 at 18:11

You can take advantage of purrr to keep all the data in a single, nested tibble. That way each csv and processed csv remains linked directly with the appropriate csv-name:

library(dplyr)   # tibble(), mutate(), select(), %>%
library(purrr)   # map()
library(readr)   # read_csv2()

csv_files <- list.files(path = data_path, pattern = '\\.csv$', full.names = TRUE)

all_csv <- tibble(csv_files) %>%
    mutate(data = map(csv_files, read_csv2),
           processed = map(data, process_csv_data),
           csv_files = gsub(data_path, "", csv_files)) %>%
    select(-data)
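If process_csv_data also needs the file name, the nested tibble makes that straightforward, since the name sits in the same row as the data. The following is a minimal sketch on in-memory toy data; the two-argument process_csv_data, the column name csv_file, and the data frames are assumptions made for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical two-argument version of process_csv_data.
process_csv_data <- function(data, name) {
  data.frame(source = name, n_rows = nrow(data))
}

# Toy data standing in for the CSV files (assumed, no files read).
raw <- list(
  "a.csv" = data.frame(x = 1:3),
  "b.csv" = data.frame(x = 1:5)
)

# One row per file: the name and the data live side by side, so map2()
# can hand both to the processing function.
nested <- tibble(csv_file = names(raw), data = unname(raw)) %>%
  mutate(processed = map2(data, csv_file, process_csv_data))

# Retrieve the processed result for one file by name.
b_result <- nested %>%
  filter(csv_file == "b.csv") %>%
  pull(processed) %>%
  pluck(1)
```

Filtering on the name column replaces the `$name_of_file` lookup that a named list would give you.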

David Klotz

This works fine, but it's a bit hard to retrieve the data by `all_csv$processed$name_of_file`. Alternatively, you can create a list of tibbles that can be accessed directly: `all_csv <- list.files(path = data_path, pattern = "*.csv", full.names = TRUE) %>% map(read_csv) %>% setNames(csv_files)`. Then you can get a file just by `all_csv$name_of_file`. – Agile Bean Aug 08 '19 at 04:06
