
Suppose I have the following data frame:

df <- data.frame(A = c(1, 2, 3), B = c("a", "b", "c"), C = c(4, 5, 6))

  A B C
1 1 a 4
2 2 b 5
3 3 c 6

If I want to know the position of a single column, e.g. column B, I can use:

which(names(df)=="B")

Or

grep("B", names(df))

In both cases, I get 2, but what if I wanted to know the positions of columns A and C at the same time? That is, I want to enter a vector of column names, and get a vector of their positions. So, if I entered "A", "C", the result should be 1 3.

The two above examples I've used don't seem to work when entering a vector of column names instead of a single one.
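
To illustrate why neither idiom extends to several names, here is a minimal sketch (the names `a` and `b` are only illustrative). With `==`, the shorter vector is recycled against `names(df)`, and `grep` only uses the first element of its pattern; both normally emit a warning, which is suppressed here so the returned values are easier to see.

```r
df <- data.frame(A = c(1, 2, 3), B = c("a", "b", "c"), C = c(4, 5, 6))

# c("A", "C") is recycled to c("A", "C", "A") and compared element-wise
# with c("A", "B", "C"), so only the first position matches.
a <- suppressWarnings(which(names(df) == c("A", "C")))
print(a)  # 1

# grep() only uses the first pattern, "A", and ignores "C".
b <- suppressWarnings(grep(c("A", "C"), names(df)))
print(b)  # 1
```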

I know I can do this with loops, but is there a method that achieves better performance?

Yang Li

2 Answers


No *apply/loop needed. You need match. See the doc at ?match. For instance:

match(c("A","C"),names(df))
#[1] 1 3
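
A few properties of `match` are worth checking before relying on it. This is a small sketch on the same `df`; the column name "Z" is a made-up name that does not exist in the data frame, and the `which(... %in% ...)` variant is shown only for comparison.

```r
df <- data.frame(A = c(1, 2, 3), B = c("a", "b", "c"), C = c(4, 5, 6))

# Positions come back in the order of the requested names, not column order.
pos <- match(c("C", "A"), names(df))
print(pos)  # 3 1

# A name that is not a column yields NA rather than an error.
pos_missing <- match(c("A", "Z"), names(df))
print(pos_missing)  # 1 NA

# which() with %in% silently drops missing names, but always returns
# positions in column order.
pos_in <- which(names(df) %in% c("C", "A"))
print(pos_in)  # 1 3
```

If the order of the requested names matters, or missing names should be detected, `match` is the safer of the two.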

Solutions based on *apply functions or explicit loops are considerably slower, because match is vectorized and does the whole lookup in a single call.
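
One way to check the performance difference on your own machine is a rough timing sketch like the one below. The wide data frame `wide`, its size, and the number of requested names are arbitrary assumptions for illustration; actual timings depend on the machine and R version, so none are shown.

```r
# Hypothetical wide data frame: 10,000 columns named V1..V10000.
n <- 10000
wide <- as.data.frame(matrix(0, nrow = 1, ncol = n))
wanted <- sample(names(wide), 1000)

# Vectorized lookup in a single call.
t_match <- system.time(p1 <- match(wanted, names(wide)))

# One which() scan over all column names per requested name.
t_loop <- system.time(p2 <- sapply(wanted, function(x) which(names(wide) == x)))

# Both approaches should agree on the positions (sapply adds names).
stopifnot(identical(p1, unname(p2)))

print(t_match)
print(t_loop)
```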

nicola
  • Accepted, thanks! Regarding the dupe, I don't think "find positions for a list of column names in a data frame" is the same as "find *the* position of an element in a vector, Bonus: find multiple" - Just because they end up using the same function. Especially because this question also involves a performance-comparing aspect between `match` and loops – Yang Li Jan 18 '17 at 22:40
  • It's exactly the same: the `v` vector of the other question is represented here by `names(df)` (which, by the way, *is* a vector). You want to find the positions of the elements of a vector in another; the fact that the second vector is represented by the column names of a `data.frame` is totally irrelevant. – nicola Jan 18 '17 at 22:44
  • I respectfully disagree, but won't bother disputing it. Also, FWIW, I was watching and your answer was earlier than @DavidArenburg's (now deleted) comment which contained the same answer, so you don't need to worry about that :) – Yang Li Jan 18 '17 at 22:55

Consider sapply(), which internally is essentially a for loop: it iterates over the vector of column names and applies which (or grep) to each one.

sapply(vector.of.columns, function(x) which(names(df) == x))
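
As a runnable sketch of the line above, assuming `vector.of.columns` holds the requested names (the name "Z" below is a made-up column that does not exist): the result is a named vector when every name is found, but `sapply` falls back to a list when a name is missing, because `which` then returns a zero-length result for that name.

```r
df <- data.frame(A = c(1, 2, 3), B = c("a", "b", "c"), C = c(4, 5, 6))

# All names exist: sapply() simplifies to an integer vector named A and C.
vector.of.columns <- c("A", "C")
res <- sapply(vector.of.columns, function(x) which(names(df) == x))
print(res)

# One name is missing: the results have different lengths, so sapply()
# cannot simplify and returns a list instead.
res_missing <- sapply(c("A", "Z"), function(x) which(names(df) == x))
print(is.list(res_missing))  # TRUE
```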
joel.wilson