I have been working on a dataset which is represented in the following way:

P1  P2  P3  P4  P5
0   2   1   0   1
0   1   0   0   0
0   0   0   3   0 
0   0   0   1   1
0   0   5   0   0
1   1   0   0   0

In R, I am trying to convert each row into the names of the columns whose dummy variable is not 0, for example:

P2,P3,P5
P2
P4
P4,P5
P3
P1,P2

I tried the approach from "Recoding dummy variable to ordered factor"; however, I am not getting multiple items per row. I am happy to generate a new transaction table that has no column names. I hope to run market basket analysis on the generated dataset.

Thanks


2 Answers


You can try:

apply(df,1,function(x) toString(names(df)[as.logical(x)]))
#[1] "P2, P3, P5" "P2"         "P4"         "P4, P5"     "P3"         "P1, P2" 

Data:

df = structure(list(P1 = c(0L, 0L, 0L, 0L, 0L, 1L), P2 = c(2L, 1L, 
0L, 0L, 0L, 1L), P3 = c(1L, 0L, 0L, 0L, 5L, 0L), P4 = c(0L, 0L, 
3L, 1L, 0L, 0L), P5 = c(1L, 0L, 0L, 1L, 0L, 0L)), .Names = c("P1", 
"P2", "P3", "P4", "P5"), class = "data.frame", row.names = c(NA, 
-6L))   
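
Since the question asks for a transaction table without column names, the one-liner above can be extended to write one basket per line to a file. The following is only a minimal sketch in base R: the file name `baskets.csv` is a hypothetical choice, the data frame is rebuilt with `data.frame()` for brevity, and `x != 0` is used in place of `as.logical(x)` (the two are equivalent for numeric input). A file in this shape is what a basket-format reader such as `read.transactions()` from the arules package expects, assuming that package is the one used for the market basket analysis.

```r
# Same data as in the question, rebuilt with data.frame() for brevity.
df <- data.frame(
  P1 = c(0, 0, 0, 0, 0, 1),
  P2 = c(2, 1, 0, 0, 0, 1),
  P3 = c(1, 0, 0, 0, 5, 0),
  P4 = c(0, 0, 3, 1, 0, 0),
  P5 = c(1, 0, 0, 1, 0, 0)
)

# One comma-separated string per row, holding the names of the non-zero columns.
baskets <- apply(df, 1, function(x) paste(names(df)[x != 0], collapse = ","))

# Write one basket per line, with no header and no row names.
# "baskets.csv" is a hypothetical file name chosen for this example;
# tempdir() keeps the sketch from writing into the working directory.
out_file <- file.path(tempdir(), "baskets.csv")
writeLines(baskets, out_file)
```

Using `collapse = ","` instead of `toString()` avoids the space after each comma, which keeps the item labels clean when the file is parsed again.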
Colonel Beauvel

Or, starting from a matrix:

# matrix() fills column by column, so each line below is one column (P1 to P4).
A <- matrix(c(0, 1, 0,
              1, 2, 0,
              0, 3, 1,
              2, 1, 5), nrow = 3, ncol = 4)
colnames(A) <- paste0("P", 1:4)


apply(A, 1, function(x) { names(x[which(x!=0)]) })

which outputs a list:

[[1]]
[1] "P2" "P4"

[[2]]
[1] "P1" "P2" "P3" "P4"

[[3]]
[1] "P3" "P4"
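
One caveat with this approach: `apply()` simplifies its result, so it returns a list only when the rows have differing numbers of non-zero entries. If every row happens to have the same number, the result is a character matrix instead. A hedged sketch of a variant that always returns a list is shown below; the helper name `nonzero_names` and the second matrix `B` are made up for illustration.

```r
# Example matrix from the answer (filled column by column).
A <- matrix(c(0, 1, 0,
              1, 2, 0,
              0, 3, 1,
              2, 1, 5), nrow = 3, ncol = 4)
colnames(A) <- paste0("P", 1:4)

# Hypothetical second matrix in which every row has exactly two non-zero entries.
B <- matrix(c(1, 0,
              1, 2,
              0, 3), nrow = 2, ncol = 3)
colnames(B) <- paste0("P", 1:3)

# apply() collapses equal-length results into a matrix, not a list.
simplified <- apply(B, 1, function(x) names(x[which(x != 0)]))

# Hypothetical helper: lapply() over the row indices always yields a list,
# with one character vector of column names per row.
nonzero_names <- function(m) {
  lapply(seq_len(nrow(m)), function(i) colnames(m)[m[i, ] != 0])
}

res_A <- nonzero_names(A)
res_B <- nonzero_names(B)
```

Here `simplified` is a 2 x 2 character matrix, while `res_A` and `res_B` are both lists regardless of how many items each row contains.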
Martin Schmelzer