How to create a new variable that shows different combinations of 4 dummy variables?

Question

I have 4 dummy variables taking values 0 or 1 corresponding to the adoption or not of a certain technology. The data frame has over 14000 rows.

I want to go over these 4 columns and record, in a new variable, which combination of them is equal to 1 in each row.

Data

structure(list(tech1 = structure(c(2L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), tech2 = structure(c(2L, 2L, 2L, 2L), .Label = c("0", "1"), class = "factor"), tech3 = structure(c(1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), tech4 = structure(c(1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor")), row.names = c(NA, 4L), class = "data.frame")

Since several combinations are possible, the new variable should record which of the 4 technologies were adopted in each row.

Here is what the first four rows of the new variable could look like (where "12" means technologies 1 and 2 were adopted, and so on):

Variable "Tech":

structure(list(Tech = structure(c(1L, 2L, 3L, 2L), .Label = c("12", "2", "234"), class = "factor")), row.names = c(NA, 4L), class = "data.frame")

I have seen some functions that could work (e.g. aggregate), but I haven't found a solution so far.

Thank you, @RonakShah. I have edited the question. – lorlu, Apr 14 '21 at 13:35

score 0 · Answer 1 · answered Apr 14 '21 at 12:42

Without knowing your desired end state: with the apply function you can get, for each row, the indices of the columns that equal 1, and, for each column, the indices of the rows that equal 1.

m <- matrix(sample(0:1, 100, replace = TRUE), ncol = 4)
rows <- apply(m, 1, function(x) which(x == 1))
cols <- apply(m, 2, function(x) which(x == 1))
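
To make the shape of the result concrete, here is a small sketch on a fixed matrix instead of random data. The matrix values mirror the four example rows in the question; this is only an illustration, not part of the original answer.

```r
# Fixed 4 x 4 matrix mirroring the question's example rows (illustrative data)
m <- matrix(c(1, 1, 0, 0,
              0, 1, 0, 0,
              0, 1, 1, 1,
              0, 1, 0, 0), ncol = 4, byrow = TRUE)

# For each row: which columns hold a 1
rows <- apply(m, 1, function(x) which(x == 1))
# For each column: which rows hold a 1
cols <- apply(m, 2, function(x) which(x == 1))

str(rows)
str(cols)
```

Both results are lists here because the index vectors differ in length. If every row (or column) happened to contain the same number of 1s, apply would simplify the result to a vector or matrix instead, so code that depends on a list should not rely on this.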

SteveM


Thanks for your reply, @SteveM. I have edited the question following the suggestion I received in the comments; I think it is easier to understand now. – lorlu Apr 14 '21 at 13:58

score 0 · Accepted Answer · answered Apr 14 '21 at 14:32

Following on from SteveM's answer:

data.frame(tech=apply(df, 1, function(x) paste(which(x==1), collapse="")))
#  tech
#1   12
#2    2
#3  234
#4    2
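
A possible way to store the result as a column of the original data frame and count how often each combination occurs is sketched below. The data frame is rebuilt from the question's example, and the names tech_cols and combo_counts are made up for this illustration.

```r
# Example data rebuilt from the question (factor columns with levels "0"/"1")
lv <- c("0", "1")
df <- data.frame(
  tech1 = factor(c("1", "0", "0", "0"), levels = lv),
  tech2 = factor(c("1", "1", "1", "1"), levels = lv),
  tech3 = factor(c("0", "0", "1", "0"), levels = lv),
  tech4 = factor(c("0", "0", "1", "0"), levels = lv)
)

# Hypothetical helper name: the columns that hold the dummies
tech_cols <- c("tech1", "tech2", "tech3", "tech4")

# Same idea as the one-liner above, restricted to the dummy columns;
# unname() drops any row names carried over by apply()
df$Tech <- unname(apply(df[tech_cols], 1,
                        function(x) paste(which(x == "1"), collapse = "")))

# Frequency of each combination
combo_counts <- table(df$Tech)
print(df)
print(combo_counts)
```

Note that pasting the indices without a separator is only unambiguous with up to 9 dummy columns; with 10 or more, something like collapse = "-" would be needed to tell "1" and "10" apart.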

Or a tidyverse method (note that the filter step drops any row in which no technology was adopted):

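library(dplyr)
library(tidyr)

df %>%
  mutate(id = row_number()) %>%               # keep track of the original row
  pivot_longer(tech1:tech4) %>%               # one line per row/technology pair
  filter(value == 1) %>%                      # keep adopted technologies only
  group_by(id) %>%
  summarise(Tech = paste(gsub("tech", "", name), collapse = ""))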

# A tibble: 4 x 2
#     id Tech
#  <int> <chr>
#1     1 12   
#2     2 2    
#3     3 234  
#4     4 2

The 1 line of code you suggested first worked perfectly, thank you! — lorlu, Apr 14 '21 at 14:58

score 0 · Answer 3 · answered Apr 14 '21 at 14:33

library(tidyverse)

(df <- tribble(
  ~dum1, ~dum2, ~dum3, ~dum4, ~value,
  FALSE, TRUE,  FALSE, TRUE,  12,
  TRUE,  TRUE,  FALSE, FALSE, 20,
  FALSE, TRUE,  FALSE, TRUE,  32,
  TRUE,  FALSE, TRUE,  FALSE, 27
))

df %>%
  mutate(dum1 = ifelse(dum1, "1", ""),
         dum2 = ifelse(dum2, "2", ""),
         dum3 = ifelse(dum3, "3", ""),
         dum4 = ifelse(dum4, "4", ""),
         which_tech = paste0(dum1, dum2, dum3, dum4))

Output:

# A tibble: 4 x 6
  dum1  dum2  dum3  dum4  value which_tech
  <chr> <chr> <chr> <chr> <dbl> <chr>     
1 ""    "2"   ""    "4"      12 24        
2 "1"   "2"   ""    ""       20 12        
3 ""    "2"   ""    "4"      32 24        
4 "1"   ""    "3"   ""       27 13
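
The same column-wise idea can be written without naming each dummy by hand, which may help if the number of dummy columns changes. The following is a base R sketch, not part of the original answer; dummy_cols and pieces are illustrative names, and the data repeats the example above as a plain data frame.

```r
# Same example data as above, as a plain data frame
df <- data.frame(
  dum1  = c(FALSE, TRUE,  FALSE, TRUE),
  dum2  = c(TRUE,  TRUE,  TRUE,  FALSE),
  dum3  = c(FALSE, FALSE, FALSE, TRUE),
  dum4  = c(TRUE,  FALSE, TRUE,  FALSE),
  value = c(12, 20, 32, 27)
)

# Hypothetical helper name: the dummy columns, in the order they should be numbered
dummy_cols <- c("dum1", "dum2", "dum3", "dum4")

# For each column, turn TRUE into its position ("1", "2", ...) and FALSE into ""
pieces <- Map(function(col, i) ifelse(col, as.character(i), ""),
              df[dummy_cols], seq_along(dummy_cols))

# Paste the per-column pieces together element-wise
df$which_tech <- do.call(paste0, unname(pieces))
print(df)
```

This works on whole columns at once rather than row by row, and it leaves the original dummy columns unchanged.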
