0

So my question doesn't sound quite right, but i don't how to formulate it more correctly.

I am trying to do the following. I have a dataframe which contains 3 columns, 2 of which can either have a 1 (yes) or a 2 (no). And 1 column that has either a 1,2 or 3. Now what I want to do is see which percentage has all 2 column on 1 and the 1 column on 1 or 2, which has all 2 column on 2 and the one column on 3, and which only has one of the 2 column on 1 and the one column on 1 or 2.

If you catch my drift?

for example df is:

    V1 V2 V3 
1   1  2  1  
2   2  2  2  
3   1  1  1  
4   2  2  1  
5   1  2  1  

Of the 5 rows 1 has all on 1 (yes), so 20% would be all 1, so i'd expect my outcome to be something like this:

all 1: 20%

Sample of data with dput(DT)

   structure(list(rom = c("2", "1", "2", "2", "2", "1", "2", "1", 
"2", "2", "1", "1", "2", "2", "2", "1", "2", "2", "2", "2", "2", 
"2", "2", "1", "2", "2", "2", "2", "1", "1", "2", "2", "2", "2", 
"2", "1", "1", "1", "1", "2", "2", "2", "1", "1", "2", "2", "1", 
"1", "2", "1", "1", "2", "2", "2", "2", "1", "1"), power = c("2", 
"2", "2", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "1", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "1", "2", "2", "2", "1", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2"), pain = c("2", "2", "3", "2", "2", "3", "2", 
"2", "2", "2", "3", "2", "3", "2", "3", "3", "3", "3", "3", "3", 
"2", "3", "3", "2", "3", "3", "2", "2", "2", "2", "3", "3", "3", 
"3", "2", "2", "3", "2", "2", "2", "3", "3", "2", "2", "3", "2", 
"2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "2")), row.names = c(NA, 
-57L), class = "data.frame")
Boudewijn
  • 49
  • 6

4 Answers4

0

If you have only 1's and 2's in your data you can calculate row-wise sum using rowSums and calculate the ratio of rows that have sum equal to number of columns in the dataframe.

mean(rowSums(df) == ncol(df)) * 100
#[1] 20

data

df <- structure(list(V1 = c(1L, 2L, 1L, 2L, 1L), V2 = c(2L, 2L, 1L, 
2L, 2L), V3 = c(1L, 2L, 1L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, -5L))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Thanks! I also have one row where it can be either a 1 or a 2. Since that one is from 1 to 3 (1 yes+explanation, 2yes, 3no). Forgot to add that to my question, apologies – Boudewijn Aug 26 '20 at 08:37
  • If you want to exclude that column which has 1, 2 or 3 you can do it with `-` sign. Let's say column 5 is that column, you can do `mean(rowSums(df[-5]) == ncol(df) - 1) * 100` – Ronak Shah Aug 26 '20 at 08:46
0

perhaps this is what you are looking for?

DT <- data.table::fread("V1 V2 V3 
   1  2  1  
   2  2  2  
   1  1  1  
   2  2  1  
   1  2  1  ")


table( rowSums( DT * DT == 1 ) )  / nrow(DT)


#   0   1   2   3 
# 0.2 0.2 0.4 0.2 
Wimpel
  • 26,031
  • 1
  • 20
  • 37
  • if you want to include other numbers (like 2), use `table( rowSums( DT * (DT == 1 | DT == 2) ) ) / nrow(DT)` – Wimpel Aug 26 '20 at 08:40
  • Thanks for your answer, however when i do this i get the following error code: "Error in FUN(left, right) : non-numeric argument to binary operator" – Boudewijn Aug 26 '20 at 08:43
  • can you post a sample of your data, using `dput()`, since the code is working on the sample data in this answer... – Wimpel Aug 26 '20 at 08:44
  • Of course: dput(DT) = structure(list(rom = c("2", "1", "2", "2", "2", ....), power = c("2", "2", "2", "1", "1", ....), pain = c("2", "2", "3", "2", "2", ....)), row.names = c(NA, -57L), class = "data.frame") – Boudewijn Aug 26 '20 at 08:51
0

Maybe you can use paste to join the columns and then table and proportions to get the shares.

proportions(table(do.call(paste, DT))) * 100
#     1 1 2      1 2 2      1 2 3      2 1 2      2 2 2      2 2 3
# 5.263158 22.807018  8.771930  5.263158 22.807018 35.087719
GKi
  • 37,245
  • 2
  • 26
  • 48
0

Another option with count

library(dplyr)
DT %>% 
    count(!!! rlang::syms(names(.))) %>%
    transmute(prop = n/sum(n) * 100)
#       prop
#1  5.263158
#2 22.807018
#3  8.771930
#4  5.263158
#5 22.807018
#6 35.087719
akrun
  • 874,273
  • 37
  • 540
  • 662