I want to extract a vector containing the names of all the variables whose values (not names themselves) contain a specific string.

For example:

> dat
  Name Mark1 Mark2 Mark3
1    A   67%   61%    87
2    B   98%   83%    26
3    C   42%   62%    98
4    D   83%   32%    36
5    E   40%   90%    80
6    F   89%   25%    44

From the data frame above, I want the names of the variables whose values contain the '%' sign. So far I have been using a for loop, but that seems like a long-winded way to do such a simple task.

> prct <- c()
> for (i in 1:ncol(dat)){
    if (any(grepl("%", dat[,i]) == T)){
      prct <- c(prct, colnames(dat)[i])
    }
  }
> prct
[1] "Mark1" "Mark2"

2 Answers

If every value in Mark1 and Mark2 contains a %, we only need to check the first row (df here is the data frame called dat in the question):

colnames(df)[grepl('%', df[1,])]
[1] "Mark1" "Mark2"

Otherwise, you can use apply with MARGIN = 2 to run any(grepl('%', x)) on each column, which returns a named logical vector:

apply(df, 2, function(x) any(grepl('%', x)))
 Name Mark1 Mark2 Mark3 
FALSE  TRUE  TRUE FALSE

If you just want the variable names, use this logical vector to subset colnames(df):

colnames(df)[apply(df, 2, function(x) any(grepl('%', x)))]
[1] "Mark1" "Mark2"
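
One thing to keep in mind is that apply() converts the data frame to a matrix first, so every column is coerced to a common type before grepl() sees it. A possible alternative, sketched below, is vapply(), which visits each column as it is. The data frame is rebuilt here under the question's name dat, and the names has_pct and prct are only illustrative:

```r
# Rebuild the example data from the question (strings kept as character)
dat <- data.frame(
  Name  = c("A", "B", "C", "D", "E", "F"),
  Mark1 = c("67%", "98%", "42%", "83%", "40%", "89%"),
  Mark2 = c("61%", "83%", "62%", "32%", "90%", "25%"),
  Mark3 = c(87L, 26L, 98L, 36L, 80L, 44L),
  stringsAsFactors = FALSE
)

# vapply() calls the function once per column and requires each call
# to return exactly one logical value
has_pct <- vapply(dat, function(x) any(grepl("%", x, fixed = TRUE)), logical(1))

# Keep only the names of the columns that matched
prct <- names(dat)[has_pct]
print(prct)
```

Passing fixed = TRUE treats '%' as a literal string rather than a regular expression. It makes no difference for '%', but it would matter for a search string such as '.' or '$'.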
divibisan

With tidyverse:

library(dplyr)    # select_if() and %>%
library(stringr)  # str_detect()

df <- read.table(text =
"  Name Mark1 Mark2 Mark3
1    A   67%   61%    87
2    B   98%   83%    26
3    C   42%   62%    98
4    D   83%   32%    36
5    E   40%   90%    80
6    F   89%   25%    44", header = TRUE)

f <- function(x) any(str_detect(x,"%"))
df %>% select_if(f) %>% colnames

#[1] "Mark1" "Mark2"

Or:

# funs() is deprecated as of dplyr 0.8.0; a formula lambda does the same job
df %>% select_if(~ any(str_detect(., "%"))) %>% colnames
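
In dplyr 1.0.0 and later, the scoped verbs such as select_if() are superseded, and the same selection can be written with select(where(...)). The following is a minimal sketch under that version assumption; pct_cols is just an illustrative name, and stringsAsFactors = FALSE is added so the result does not depend on the R version:

```r
library(dplyr)    # assumes dplyr >= 1.0.0, which provides where()
library(stringr)

df <- read.table(text =
"  Name Mark1 Mark2 Mark3
1    A   67%   61%    87
2    B   98%   83%    26
3    C   42%   62%    98
4    D   83%   32%    36
5    E   40%   90%    80
6    F   89%   25%    44", header = TRUE, stringsAsFactors = FALSE)

# where() keeps the columns for which the predicate returns TRUE;
# fixed("%") matches the literal character instead of a regex
pct_cols <- df %>%
  select(where(~ any(str_detect(.x, fixed("%"))))) %>%
  colnames()

print(pct_cols)
```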
Nicolas2