I am working with survey data. It includes several matrix questions that ask respondents how satisfied they are with a number of items. Below is an example.

q1: How satisfied are you with item A? "Very satisfied", "Somewhat satisfied", "Somewhat dissatisfied", "Very dissatisfied"

q2: How satisfied are you with item B? "Very satisfied", "Somewhat satisfied", "Somewhat dissatisfied", "Very dissatisfied"

q3: How satisfied are you with item C? "Very satisfied", "Somewhat satisfied", "Somewhat dissatisfied", "Very dissatisfied"

q4: How satisfied are you with item D? "Very satisfied", "Somewhat satisfied", "Somewhat dissatisfied", "Very dissatisfied"

The data look like this:

df <- data.frame(q1 = c("Very satisfied", "Somewhat satisfied", "Very satisfied", "Very satisfied", "Somewhat satisfied", "Very dissatisfied", "Very satisfied", "Very dissatisfied", "Very dissatisfied", "Somewhat dissatisfied"),
                 q2 = c("Somewhat satisfied", "Very satisfied", "Somewhat satisfied", "Very satisfied", "Very satisfied", "Somewhat dissatisfied", "Somewhat dissatisfied", "Somewhat dissatisfied", "Very dissatisfied", "Very dissatisfied"),
                 q3 = c("Very satisfied", "Somewhat satisfied", "Very satisfied", "Very satisfied", "Somewhat satisfied", "Very dissatisfied", "Very satisfied", "Very dissatisfied", "Very dissatisfied", "Somewhat dissatisfied"),
                 q4 = c("Somewhat satisfied", "Very satisfied", "Somewhat satisfied", "Very satisfied", "Very satisfied", "Somewhat dissatisfied", "Somewhat dissatisfied", "Somewhat dissatisfied", "Very dissatisfied", "Very dissatisfied"))

                      q1                    q2                    q3                    q4
1         Very satisfied    Somewhat satisfied        Very satisfied    Somewhat satisfied
2     Somewhat satisfied        Very satisfied    Somewhat satisfied        Very satisfied
3         Very satisfied    Somewhat satisfied        Very satisfied    Somewhat satisfied
4         Very satisfied        Very satisfied        Very satisfied        Very satisfied
5     Somewhat satisfied        Very satisfied    Somewhat satisfied        Very satisfied
6      Very dissatisfied Somewhat dissatisfied     Very dissatisfied Somewhat dissatisfied
7         Very satisfied Somewhat dissatisfied        Very satisfied Somewhat dissatisfied
8      Very dissatisfied Somewhat dissatisfied     Very dissatisfied Somewhat dissatisfied
9      Very dissatisfied     Very dissatisfied     Very dissatisfied     Very dissatisfied
10 Somewhat dissatisfied     Very dissatisfied Somewhat dissatisfied     Very dissatisfied

I need to find all observations that match any of the following patterns:

case1

if q1 = "Very satisfied" and q2 = "Somewhat satisfied" and q3 = "Very satisfied" and q4 = "Somewhat satisfied"

case2

or q1 = "Very satisfied" and q2 = "Somewhat dissatisfied" and q3 = "Very satisfied" and q4 = "Somewhat dissatisfied"

case3

or q1 = "Very satisfied" and q2 = "Very dissatisfied" and q3 = "Very satisfied" and q4 = "Very dissatisfied"

I can find this pattern with the command below. However, I have to do this for several matrices, and the number of questions in each matrix varies, so I wonder if anyone knows an easier way of doing this.


library(dplyr)

df %>%
  mutate(case1 = ifelse((q1 %in% "Very satisfied" & q2 %in% "Somewhat satisfied" & q3 %in% "Very satisfied" & q4 %in% "Somewhat satisfied"), TRUE, FALSE),
         case2 = ifelse((q1 %in% "Very satisfied" & q2 %in% "Somewhat dissatisfied" & q3 %in% "Very satisfied" & q4 %in% "Somewhat dissatisfied"), TRUE, FALSE),
         case3 = ifelse((q1 %in% "Very satisfied" & q2 %in% "Very dissatisfied" & q3 %in% "Very satisfied" & q4 %in% "Very dissatisfied"), TRUE, FALSE),
         zigzag = ifelse((case1 %in% TRUE | case2 %in% TRUE | case3 %in% TRUE), 1, 0)
         )


                      q1                    q2                    q3                    q4 case1 case2 case3 zigzag
1         Very satisfied    Somewhat satisfied        Very satisfied    Somewhat satisfied  TRUE FALSE FALSE      1
2     Somewhat satisfied        Very satisfied    Somewhat satisfied        Very satisfied FALSE FALSE FALSE      0
3         Very satisfied    Somewhat satisfied        Very satisfied    Somewhat satisfied  TRUE FALSE FALSE      1
4         Very satisfied        Very satisfied        Very satisfied        Very satisfied FALSE FALSE FALSE      0
5     Somewhat satisfied        Very satisfied    Somewhat satisfied        Very satisfied FALSE FALSE FALSE      0
6      Very dissatisfied Somewhat dissatisfied     Very dissatisfied Somewhat dissatisfied FALSE FALSE FALSE      0
7         Very satisfied Somewhat dissatisfied        Very satisfied Somewhat dissatisfied FALSE  TRUE FALSE      1
8      Very dissatisfied Somewhat dissatisfied     Very dissatisfied Somewhat dissatisfied FALSE FALSE FALSE      0
9      Very dissatisfied     Very dissatisfied     Very dissatisfied     Very dissatisfied FALSE FALSE FALSE      0
10 Somewhat dissatisfied     Very dissatisfied Somewhat dissatisfied     Very dissatisfied FALSE FALSE FALSE      0


Thank you in advance!

  • Have you done any searching on selection of items based on logical conditions in multiple columns? I'm pretty sure this has been asked and answered. – IRTFM Feb 01 '20 at 20:26
  • I would also ask you to not cross-post to twitter or other places at the same time---that is mostly frowned upon. If you ask here, trust that enough eyeballs will get to it. – Dirk Eddelbuettel Feb 01 '20 at 20:44
  • FYI, `ifelse(cond, TRUE, FALSE)` is *identical* to `cond` (without `ifelse`). – r2evans Feb 01 '20 at 20:44
  • Yes, I searched for it, but none of the previous solutions worked. – Fahim Ahmad Feb 01 '20 at 21:01
  • You should post your full coding efforts and links to the posts that you found. We cannot know what errors you are making. If all you need to do is "find" the rows where a pattern exists then: `df[with(df, q1 %in% "Very satisfied" & q2 %in% "Somewhat satisfied" & q3 %in% "Very satisfied" & q4 %in% "Somewhat satisfied"), ]` – IRTFM Feb 01 '20 at 21:20
  • Okay, in the future I will follow what you suggested. Yes, you got it right: I want to flag the rows that follow a specific pattern, like the one above. But it would be too time-consuming to apply the same syntax to several matrices, particularly when the number of questions in a single matrix sometimes reaches 20. I am looking for an easier way of doing it. – Fahim Ahmad Feb 01 '20 at 21:30

1 Answer

For the example you offered, since the conditions on q1 and q3 are the same in every case, you can get your zigzag result with just:

df[with(df, q1 == "Very satisfied" &
            q2 == q4 &
            q3 == "Very satisfied" &
            q4 %in% c("Very dissatisfied", "Somewhat dissatisfied", "Somewhat satisfied")), ]
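
If the goal is to keep every row and flag the matches rather than subset them, the same condition can be stored as a column. Here is a minimal sketch in base R. It rebuilds the example data from the question so it runs on its own, relying on the fact that q3 repeats q1 and q4 repeats q2 in that example:

```r
# Example data from the question (q3 repeats q1, q4 repeats q2)
q1 <- c("Very satisfied", "Somewhat satisfied", "Very satisfied", "Very satisfied",
        "Somewhat satisfied", "Very dissatisfied", "Very satisfied",
        "Very dissatisfied", "Very dissatisfied", "Somewhat dissatisfied")
q2 <- c("Somewhat satisfied", "Very satisfied", "Somewhat satisfied", "Very satisfied",
        "Very satisfied", "Somewhat dissatisfied", "Somewhat dissatisfied",
        "Somewhat dissatisfied", "Very dissatisfied", "Very dissatisfied")
df <- data.frame(q1 = q1, q2 = q2, q3 = q1, q4 = q2)

# Logical flag: TRUE where the row matches case1, case2 or case3
df$zigzag <- with(df, q1 == "Very satisfied" &
                      q3 == "Very satisfied" &
                      q2 == q4 &
                      q4 %in% c("Very dissatisfied", "Somewhat dissatisfied",
                                "Somewhat satisfied"))

which(df$zigzag)  # row numbers of the flagged observations
```

Note that `==` returns `NA` for missing answers, whereas the `%in%` tests in the question return `FALSE`, so rows with missing values may need separate handling.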

r2evans has already pointed out the redundancy of using ifelse. Had you wanted a numeric value for the zigzag result you could have more compactly used just:

zigzag = as.numeric( case1 | case2 | case3 ) # since 1 == TRUE
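
For several matrices with a varying number of questions, one possible generalization is a small helper that takes the column names as an argument. This is only a sketch under an assumption: that "zigzag" always means every odd-positioned item has one fixed answer and every even-positioned item shares a single answer from an allowed set, which is what the three cases in the question amount to. The names `flag_zigzag`, `odd_value` and `even_values` are made up for illustration.

```r
# Example data from the question (q3 repeats q1, q4 repeats q2)
q1 <- c("Very satisfied", "Somewhat satisfied", "Very satisfied", "Very satisfied",
        "Somewhat satisfied", "Very dissatisfied", "Very satisfied",
        "Very dissatisfied", "Very dissatisfied", "Somewhat dissatisfied")
q2 <- c("Somewhat satisfied", "Very satisfied", "Somewhat satisfied", "Very satisfied",
        "Very satisfied", "Somewhat dissatisfied", "Somewhat dissatisfied",
        "Somewhat dissatisfied", "Very dissatisfied", "Very dissatisfied")
df <- data.frame(q1 = q1, q2 = q2, q3 = q1, q4 = q2)

# Hypothetical helper: flags rows where odd-positioned columns all equal
# `odd_value` and even-positioned columns all share one of `even_values`.
flag_zigzag <- function(data, cols,
                        odd_value = "Very satisfied",
                        even_values = c("Somewhat satisfied",
                                        "Somewhat dissatisfied",
                                        "Very dissatisfied")) {
  stopifnot(length(cols) >= 2)
  m    <- as.matrix(data[cols])  # character matrix, one column per question
  odd  <- m[, seq(1, ncol(m), by = 2), drop = FALSE]
  even <- m[, seq(2, ncol(m), by = 2), drop = FALSE]

  # Counting matches with na.rm = TRUE makes missing answers count as non-matches
  odd_ok  <- rowSums(odd == odd_value, na.rm = TRUE) == ncol(odd)
  even_ok <- even[, 1] %in% even_values &
             rowSums(even == even[, 1], na.rm = TRUE) == ncol(even)

  as.numeric(odd_ok & even_ok)
}

df$zigzag <- flag_zigzag(df, paste0("q", 1:4))
```

The same call should work for a longer matrix by passing a different `cols` vector, for example `paste0("q", 1:20)`, provided the columns are listed in questionnaire order.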
IRTFM