R: Convert binary categorical variables to long data format

Question

mydata <- structure(list(id = 1:10, cafe = c(0, 1, 0, 0, 1, 1, 0, 0, 1, 
1), playground = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 0), classroom = c(0, 
0, 0, 0, 0, 1, 1, 1, 1, 1), gender = structure(c(2L, 2L, 2L, 
2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("Female", "Male"), class = "factor"), 
    job = structure(c(2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L), .Label = c("Student", 
    "Teacher"), class = "factor")), .Names = c("id", "cafe", 
"playground", "classroom", "gender", "job"), row.names = c(NA, 
-10L), class = "data.frame")

> mydata
   id cafe playground classroom gender     job
1   1    0          1         0   Male Teacher
2   2    1          1         0   Male Student
3   3    0          1         0   Male Teacher
4   4    0          1         0   Male Student
5   5    1          1         0   Male Teacher
6   6    1          1         1   Male Teacher
7   7    0          0         1 Female Teacher
8   8    0          1         1   Male Teacher
9   9    1          1         1 Female Teacher
10 10    1          0         1   Male Student

My desired long format data set should look like:

id      response    gender        job
1     playground      Male    Teacher
2           cafe      Male    Student
2     playground      Male    Student
3     playground      Male    Teacher
...

Essentially, the response column corresponds to which of the cafe, playground, and classroom columns have a value of 1. I've looked into several examples here and here, but they do not deal with binary data columns.

score 3 · Accepted Answer · answered Jun 04 '17 at 19:06

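We can do this with the tidyverse: gather the three indicator columns into key/value pairs, keep only the rows where the value is 1, and select the columns we need.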

library(tidyverse)
mydata %>%
    gather(response, value, cafe:classroom) %>% 
    filter(value==1) %>%
    select(id, response, gender, job)
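
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following is a minimal sketch of the same idea with the newer function, not part of the original answer; it rebuilds mydata compactly so it runs on its own, and the result name long is only illustrative.

```r
library(tidyr)
library(dplyr)

# Same data as in the question, rebuilt compactly so the example is self-contained
mydata <- data.frame(
  id         = 1:10,
  cafe       = c(0, 1, 0, 0, 1, 1, 0, 0, 1, 1),
  playground = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 0),
  classroom  = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  gender     = factor(c("Male", "Male", "Male", "Male", "Male",
                        "Male", "Female", "Male", "Female", "Male")),
  job        = factor(c("Teacher", "Student", "Teacher", "Student", "Teacher",
                        "Teacher", "Teacher", "Teacher", "Teacher", "Student"))
)

long <- mydata %>%
  # stack the three 0/1 columns into a name column and a value column
  pivot_longer(cafe:classroom, names_to = "response", values_to = "value") %>%
  # keep only the places each person actually selected
  filter(value == 1) %>%
  select(id, response, gender, job)

long
```

Unlike gather(), pivot_longer() keeps the rows of each id together, so the result is already ordered by id as in the desired output.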

akrun

robbertjan94 · Answer 2 · 2017-06-04T19:22:12.960

This can be done by using the melt(data, ...) function from the reshape package.

library(reshape)

First, we assign the variables that we want to keep as columns.

id <- c("id", "gender", "job")

Then, we change the wide format to long format and keep only the rows that contain a 1.

df <- melt(mydata, id=id)
df <- df[df[,5]==1,-5]  # column 5 is "value": keep the rows equal to 1, then drop that column

Then, order the data by id.

df <- df[order(df[,"id"]),]

Finally, we change the column name and rearrange the columns.

colnames(df)[4] <- "response"
df <- df[,c(1,4,2,3)]

## id   response  gender    job
## 1  playground   Male Teacher
## 2        cafe   Male Student
## 2  playground   Male Student
## 3  playground   Male Teacher
## ...
## ...
## 9   classroom Female Teacher
## 10       cafe   Male Student
## 10  classroom   Male Student
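
If installing a reshaping package is not an option, the same result can probably be obtained in base R. The following is a minimal sketch and not taken from either answer; the names resp, hits and long are illustrative, and mydata is rebuilt compactly so the block runs on its own.

```r
# Same data as in the question, rebuilt compactly so the example is self-contained
mydata <- data.frame(
  id         = 1:10,
  cafe       = c(0, 1, 0, 0, 1, 1, 0, 0, 1, 1),
  playground = c(1, 1, 1, 1, 1, 1, 0, 1, 1, 0),
  classroom  = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  gender     = factor(c("Male", "Male", "Male", "Male", "Male",
                        "Male", "Female", "Male", "Female", "Male")),
  job        = factor(c("Teacher", "Student", "Teacher", "Student", "Teacher",
                        "Teacher", "Teacher", "Teacher", "Teacher", "Student"))
)

resp <- c("cafe", "playground", "classroom")

# Row and column position of every 1 in the indicator columns
hits <- which(as.matrix(mydata[resp]) == 1, arr.ind = TRUE)

# One output row per 1: look up the person by row and the place by column
long <- data.frame(
  id       = mydata$id[hits[, "row"]],
  response = resp[hits[, "col"]],
  gender   = mydata$gender[hits[, "row"]],
  job      = mydata$job[hits[, "row"]]
)

# which() walks the matrix column by column, so sort by id afterwards
long <- long[order(long$id), ]
rownames(long) <- NULL

long
```

Referring to columns by name rather than by position keeps the code working if the column order of mydata changes.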
