Is there a way to require row values to correspond with eachother in R?

Question

So my question doesn't sound quite right, but i don't how to formulate it more correctly.

I am trying to do the following. I have a dataframe which contains 3 columns, 2 of which can either have a 1 (yes) or a 2 (no). And 1 column that has either a 1,2 or 3. Now what I want to do is see which percentage has all 2 column on 1 and the 1 column on 1 or 2, which has all 2 column on 2 and the one column on 3, and which only has one of the 2 column on 1 and the one column on 1 or 2.

If you catch my drift?

for example df is:

    V1 V2 V3 
1   1  2  1  
2   2  2  2  
3   1  1  1  
4   2  2  1  
5   1  2  1

Of the 5 rows 1 has all on 1 (yes), so 20% would be all 1, so i'd expect my outcome to be something like this:

all 1: 20%

Sample of data with dput(DT)

   structure(list(rom = c("2", "1", "2", "2", "2", "1", "2", "1", 
"2", "2", "1", "1", "2", "2", "2", "1", "2", "2", "2", "2", "2", 
"2", "2", "1", "2", "2", "2", "2", "1", "1", "2", "2", "2", "2", 
"2", "1", "1", "1", "1", "2", "2", "2", "1", "1", "2", "2", "1", 
"1", "2", "1", "1", "2", "2", "2", "2", "1", "1"), power = c("2", 
"2", "2", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "1", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "1", "2", "2", "2", "1", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2"), pain = c("2", "2", "3", "2", "2", "3", "2", 
"2", "2", "2", "3", "2", "3", "2", "3", "3", "3", "3", "3", "3", 
"2", "3", "3", "2", "3", "3", "2", "2", "2", "2", "3", "3", "3", 
"3", "2", "2", "3", "2", "2", "2", "3", "3", "2", "2", "3", "2", 
"2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "2")), row.names = c(NA, 
-57L), class = "data.frame")

score 0 · Answer 1 · answered Aug 26 '20 at 08:30

0

If you have only 1's and 2's in your data you can calculate row-wise sum using rowSums and calculate the ratio of rows that have sum equal to number of columns in the dataframe.

mean(rowSums(df) == ncol(df)) * 100
#[1] 20

data

df <- structure(list(V1 = c(1L, 2L, 1L, 2L, 1L), V2 = c(2L, 2L, 1L, 
2L, 2L), V3 = c(1L, 2L, 1L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, -5L))

answered Aug 26 '20 at 08:30

Ronak Shah

377,200
20
156
213

Thanks! I also have one row where it can be either a 1 or a 2. Since that one is from 1 to 3 (1 yes+explanation, 2yes, 3no). Forgot to add that to my question, apologies – Boudewijn Aug 26 '20 at 08:37
If you want to exclude that column which has 1, 2 or 3 you can do it with `-` sign. Let's say column 5 is that column, you can do `mean(rowSums(df[-5]) == ncol(df) - 1) * 100` – Ronak Shah Aug 26 '20 at 08:46

score 0 · Answer 2 · answered Aug 26 '20 at 08:37

0

perhaps this is what you are looking for?

DT <- data.table::fread("V1 V2 V3 
   1  2  1  
   2  2  2  
   1  1  1  
   2  2  1  
   1  2  1  ")


table( rowSums( DT * DT == 1 ) )  / nrow(DT)


#   0   1   2   3 
# 0.2 0.2 0.4 0.2

answered Aug 26 '20 at 08:37

Wimpel

26,031
1
20
37

if you want to include other numbers (like 2), use `table( rowSums( DT * (DT == 1 | DT == 2) ) ) / nrow(DT)` – Wimpel Aug 26 '20 at 08:40
Thanks for your answer, however when i do this i get the following error code: "Error in FUN(left, right) : non-numeric argument to binary operator" – Boudewijn Aug 26 '20 at 08:43
can you post a sample of your data, using `dput()`, since the code is working on the sample data in this answer... – Wimpel Aug 26 '20 at 08:44
Of course: dput(DT) = structure(list(rom = c("2", "1", "2", "2", "2", ....), power = c("2", "2", "2", "1", "1", ....), pain = c("2", "2", "3", "2", "2", ....)), row.names = c(NA, -57L), class = "data.frame") – Boudewijn Aug 26 '20 at 08:51

GKi · Accepted Answer · 2020-08-26T09:01:05.720

0

Maybe you can use paste to join the columns and then table and proportions to get the shares.

proportions(table(do.call(paste, DT))) * 100
#     1 1 2      1 2 2      1 2 3      2 1 2      2 2 2      2 2 3
# 5.263158 22.807018  8.771930  5.263158 22.807018 35.087719

edited Aug 26 '20 at 09:01

answered Aug 26 '20 at 08:57

GKi

37,245
2
26
48

Thank you! That worked. And I assume those values *100 will give me the percentages? – Boudewijn Aug 26 '20 at 09:00
Yes. If have add this `*100`. – GKi Aug 26 '20 at 09:01

score 0 · Answer 4 · answered Aug 26 '20 at 20:56

0

Another option with count

library(dplyr)
DT %>% 
    count(!!! rlang::syms(names(.))) %>%
    transmute(prop = n/sum(n) * 100)
#       prop
#1  5.263158
#2 22.807018
#3  8.771930
#4  5.263158
#5 22.807018
#6 35.087719

answered Aug 26 '20 at 20:56

akrun

874,273
37
540
662

Is there a way to require row values to correspond with eachother in R?

4 Answers4