5

Given an R list, I wish to find the index of a given list entry. For example, for entry "36", I want my output to be "2". Also, how could I do such queries in parallel using lapply?

> list

$`1`
[1] "7"  "12" "26" "29"

$`2`
[1] "11" "36"

$`3`
[1] "20" "49"

$`4`
[1] "39" "41"
SAT

2 Answers

11

Here's a one-liner that allows for the (likely?) possibility that more than one element of the list will contain the string for which you're searching:

## Some example data
ll <- list(1:4, 5:6, 7:12, 1:12)
ll <- lapply(ll, as.character)

which(sapply(ll, FUN=function(X) "12" %in% X))
# [1] 3 4
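
As a possible variation on the one-liner above, the same idea can be wrapped in a small function and applied to the list from the question. This is only a sketch: the names `lst` and `find_all` are made up for illustration, and `vapply` is swapped in for `sapply` so the result type stays logical even when the list is empty.

```r
# The list from the question, rebuilt by hand (names "1".."4" as printed there)
lst <- list(`1` = c("7", "12", "26", "29"),
            `2` = c("11", "36"),
            `3` = c("20", "49"),
            `4` = c("39", "41"))

# Hypothetical helper: positions of every list element that contains `val`
find_all <- function(val, x) {
  hits <- vapply(x, function(el) val %in% el, logical(1))
  unname(which(hits))  # drop the list names, keep the integer positions
}

find_all("36", lst)  # 2
find_all("99", lst)  # integer(0): no element contains "99"
```

A value that is absent gives a zero-length result rather than `NA`, which is worth checking for before using the output as an index.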
Josh O'Brien
  • Thanks so much! R can be a bit of a headache at the beginning. – SAT Apr 02 '12 at 18:08
  • @Josh: shouldn't your example return `[1] 3 4` ? – Carl Witthoft Apr 02 '12 at 18:27
  • @CarlWitthoft -- Yep, thanks. (I did a quick code edit early on but apparently neglected to change the results bit.) Fixed it now. Also, do feel free to edit things like that yourself (at least in any of my posts)! – Josh O'Brien Apr 02 '12 at 18:33
  • This works, but why do you convert it to character first? For me it worked with just `which(sapply(ll, FUN=function(X) 12 %in% X))` – mikey Mar 08 '21 at 13:59
  • @mikey The OP's question was about a list of character vectors, so it looks like I created an example to match theirs. And yeah, it'll work much more generally. – Josh O'Brien Mar 08 '21 at 15:16
  • @JoshO'Brien. Thanks, I wasn't looking that closely at the question. – mikey Mar 08 '21 at 16:08
3

You could first turn your list into a data.frame that maps values to their corresponding index in the list:

ll <- list(c("7", "12", "26", "29"),
           c("11", "36"),
           c("20", "49"),
           c("39", "41"))

# lengths(ll) returns an integer vector with the size of each list element
df <- data.frame(value = unlist(ll),
                 index = rep(seq_along(ll), lengths(ll)))
df
#    value index
# 1      7     1
# 2     12     1
# 3     26     1
# 4     29     1
# 5     11     2
# 6     36     2
# 7     20     3
# 8     49     3
# 9     39     4
# 10    41     4

Then, write a function that uses match to find the index of the first occurrence of a given value:

find.idx <- function(val) df$index[match(val, df$value)]

You can call this function on a single value, or many at a time since match is vectorized:

find.idx("36")
# [1] 2
find.idx(c("36", "41", "99"))
# [1]  2  4 NA
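
Because match only reports the first occurrence, a value that appears in several list elements would show up with a single index. If that matters, one possible alternative is to build a lookup with split. This is a sketch under assumptions: the data below is a made-up list containing a duplicate, `lookup` is an illustrative name, and `lengths` needs R 3.2.0 or later.

```r
# Made-up data in which "12" appears in elements 1 and 3
ll_dup <- list(c("7", "12"),
               c("11", "36"),
               c("12", "49"))

# Group the element positions by value: one entry per distinct value
lookup <- split(rep(seq_along(ll_dup), lengths(ll_dup)), unlist(ll_dup))

lookup[["12"]]  # 1 3
lookup[["36"]]  # 2
lookup[["99"]]  # NULL: the value is not in the list
```

The lookup is built once, so repeated queries only pay for a name lookup in a list.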

Of course, you can also run find.idx through lapply, especially if you plan to run it in parallel:

lapply(c("36", "41", "99"), find.idx)
# [[1]]
# [1] 2
# 
# [[2]]
# [1] 4
# 
# [[3]]
# [1] NA

There are many options for running this last step in parallel. I would recommend weighing them using the CRAN High-Performance Computing task view: http://cran.r-project.org/web/views/HighPerformanceComputing.html.
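
As one concrete option among those, here is a minimal sketch using the parallel package that ships with R. It assumes two local worker processes are acceptable, and it uses a hypothetical helper, `find_idx_in`, that receives the lookup table as an argument so nothing has to be exported to the workers separately. For only a handful of queries the vectorized match call is likely faster than any parallel setup; this is just to show the shape of the call.

```r
library(parallel)  # part of the standard R distribution

ll <- list(c("7", "12", "26", "29"),
           c("11", "36"),
           c("20", "49"),
           c("39", "41"))
df <- data.frame(value = unlist(ll),
                 index = rep(seq_along(ll), lengths(ll)),
                 stringsAsFactors = FALSE)

# Hypothetical helper: like find.idx, but the table is passed in explicitly
find_idx_in <- function(val, lookup) lookup$index[match(val, lookup$value)]

cl <- makeCluster(2)  # assumption: two local workers
res <- tryCatch(
  parLapply(cl, c("36", "41", "99"), find_idx_in, lookup = df),
  finally = stopCluster(cl)  # always shut the workers down
)

unlist(res)  # 2 4 NA
```

On Unix-like systems, `mclapply` from the same package is a fork-based alternative that avoids creating a cluster explicitly.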

flodel