R Tidyverse - Identify proportion of select columns meeting criteria

Question

I have data like this:

x1 = seq(0, 2, length=5)
x2 = seq(1, 2, length=5)
x3 = seq(0, 1, length=5)
df = data.frame(cbind(x1,x2,x3))

I would like to obtain the proportion of specific columns (based on the name) that have a value less than 1. The following selects the variables that contain "x" in the name and sums across the values in the columns.

df <- df %>% 
  mutate(sumVar = rowSums(select(., contains("x")), na.rm = TRUE))

Is there a way to include ifelse logic within this setup to determine the proportion of columns with values < 1 (as opposed to calculating the sum as I have here)? I'm using the contains() helper because I want to calculate this across a larger number of columns that are not necessarily in order but share the same pattern in their names.

score 0 · Accepted Answer · answered Apr 17 '20 at 16:08

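You can apply rowMeans() to the condition itself: comparing the selected columns with 1 yields a matrix of TRUE/FALSE values, and the row mean of those values is the proportion of columns below 1: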

library(dplyr)

df %>% 
  mutate(propVar = rowMeans(select(., contains("x")) < 1))

   x1   x2   x3   propVar
1 0.0 1.00 0.00 0.6666667
2 0.5 1.25 0.25 0.6666667
3 1.0 1.50 0.50 0.3333333
4 1.5 1.75 0.75 0.3333333
5 2.0 2.00 1.00 0.0000000
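
Since the question's original rowSums() call used na.rm = TRUE, here is a minimal sketch of how missing values could be handled with the same approach. The NA placed in x2 is an assumption added purely for illustration; it is not part of the original data.

```r
library(dplyr)

# Toy data in the layout of the output above; the NA is added for illustration
df <- data.frame(
  x1 = seq(0, 2, length.out = 5),
  x2 = seq(1, 2, length.out = 5),
  x3 = seq(0, 1, length.out = 5)
)
df$x2[1] <- NA

# na.rm = TRUE drops missing comparisons, so each row's denominator is the
# number of non-missing columns rather than the total number of columns
result <- df %>%
  mutate(propVar = rowMeans(select(., contains("x")) < 1, na.rm = TRUE))

result
```

Without na.rm = TRUE, any row containing an NA in the selected columns would get NA as its proportion, and a row where every selected column is missing gives NaN even with it. In dplyr 1.1.0 and later, pick(contains("x")) should be usable in place of select(., contains("x")).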

Ritchie Sacramento


As a follow-up, can I simply do the following if I want to calculate the mean only across variables that are both > 0 and < 1? df %>% mutate(propVar = rowMeans(select(., contains("x")) < 0 & select(., contains("x")) < 1)) – Jason Schoeneberger Apr 22 '20 at 20:01
Yes - although there's a typo in the code (`<` is used twice). `df %>% mutate(propVar = rowMeans(select(., starts_with("X")) > 0 & select(., starts_with("X")) < 1))`. – Ritchie Sacramento Apr 22 '20 at 22:12
Yep...got it! Thanks for the confirmation! – Jason Schoeneberger Apr 23 '20 at 12:08
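
To avoid repeating the column selection in a two-sided condition like the one discussed in these comments, one option is to wrap the comparison in a small function. This is only a sketch: prop_between is a hypothetical helper name, not something from dplyr, and the data are rebuilt column-wise to match the output shown in the answer.

```r
library(dplyr)

df <- data.frame(
  x1 = seq(0, 2, length.out = 5),
  x2 = seq(1, 2, length.out = 5),
  x3 = seq(0, 1, length.out = 5)
)

# Hypothetical helper: proportion of values in each row that lie
# strictly between lo and hi
prop_between <- function(cols, lo, hi) {
  rowMeans(cols > lo & cols < hi)
}

# The selection is written once and passed to the helper
result <- df %>%
  mutate(propVar = prop_between(select(., contains("x")), 0, 1))

result
```

With this data the first and last rows come out as 0, because values exactly equal to 0 or 1 fail the strict inequalities.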

score 0 · Answer 2 · answered Apr 17 '20 at 17:29

We can use rowMeans in base R:

df$propVar <- rowMeans(df[startsWith(names(df), "x")] < 1)
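
Note that startsWith() only matches a prefix, whereas the question asks for names that contain a pattern. A rough base R sketch of that variant uses grepl() instead. The id column here is a hypothetical extra column, added only to show that non-matching columns are left out of the calculation.

```r
df <- data.frame(
  x1 = seq(0, 2, length.out = 5),
  x2 = seq(1, 2, length.out = 5),
  x3 = seq(0, 1, length.out = 5),
  id = 1:5  # hypothetical column that should not be counted
)

# TRUE for every column whose name contains "x" anywhere (case-sensitive)
is_x <- grepl("x", names(df), fixed = TRUE)

# Proportion of the matching columns with a value below 1, per row
df$propVar <- rowMeans(df[is_x] < 1)

df
```

Unlike contains(), which ignores case by default, grepl() with fixed = TRUE is case-sensitive, so the pattern would need adjusting if some column names use an uppercase "X".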

akrun

