I have a dataset:
data <- data.frame(a = c(1,0,0,1,0),
                   b = c(0,1,1,0,0),
                   c = c(0,0,0,0,1))
How would I turn this into a single categorical column that looks like this:
data$tranformed <- c("A","B","B","A","C")
You could do this:
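# Row/column positions of every 1 in the data frame
w <- which(data == 1, arr.ind = TRUE)
# Sort by row, then map each column index to its upper-cased column name
data$tranformed <- toupper(names(data)[w[order(w[, 1]), 2]])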
# a b c tranformed
#1 1 0 0 A
#2 0 1 0 B
#3 0 1 0 B
#4 1 0 0 A
#5 0 0 1 C
This approach is preferable because it works from the column names rather than hard-coded letters: if you rename the columns, the labels change accordingly.
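To illustrate that the labels follow the column names, here is a minimal sketch of the same `which()` approach on a data frame with different names. The names `red`, `green` and `blue` are hypothetical, chosen only for this example.

```r
# Hypothetical column names, used only to show that the labels follow them
data <- data.frame(red   = c(1, 0, 0, 1, 0),
                   green = c(0, 1, 1, 0, 0),
                   blue  = c(0, 0, 0, 0, 1))

# Row/column positions of every 1
w <- which(data == 1, arr.ind = TRUE)

# Order by row, then look up the column name for each position
labels <- toupper(names(data)[w[order(w[, 1]), 2]])
print(labels)
```

Note that this assumes every row has exactly one 1; otherwise the number of labels will not match the number of rows.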
A more concise way is max.col, which returns the column index of each row's maximum:
data$tranformed <- toupper(names(data)[max.col(data)])
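One thing to be aware of: `max.col()` breaks ties at random by default. With exactly one 1 per row there are no ties, but if you want deterministic behaviour regardless, a small variation (a sketch, not the only way) is to pass `ties.method = "first"`:

```r
data <- data.frame(a = c(1, 0, 0, 1, 0),
                   b = c(0, 1, 1, 0, 0),
                   c = c(0, 0, 0, 0, 1))

# as.matrix() makes the input type explicit;
# ties.method = "first" picks the first maximal column instead of a random one
idx <- max.col(as.matrix(data), ties.method = "first")
labels <- toupper(names(data)[idx])
print(labels)
```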
If data is allowed to have rows without any 1, like this:
# a b c
#1 1 0 0
#2 0 1 0
#3 0 0 0
#4 1 0 0
#5 0 0 1
data <- structure(list(a = c(1, 0, 0, 1, 0), b = c(0, 1, 0, 0, 0), c = c(0,
0, 0, 0, 1)), .Names = c("a", "b", "c"), row.names = c(NA, -5L
), class = "data.frame")
You could do this:
inds <- which(rowSums(data)==0)
data$tranformed <- toupper(names(data)[max.col(data)])
data$tranformed[inds] <- NA
Which will give you:
# a b c tranformed
#1 1 0 0 A
#2 0 1 0 B
#3 0 0 0 <NA>
#4 1 0 0 A
#5 0 0 1 C
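If you need this more than once, the steps above could be wrapped in a small function. This is only a sketch: `one_hot_to_label` and its `cols` argument are hypothetical names, not part of any package. Restricting the computation to the indicator columns also means it still works after the new character column has been added to the data frame.

```r
# Hypothetical helper: collapse the indicator columns `cols` of `df`
# into a single vector of upper-cased labels
one_hot_to_label <- function(df, cols = names(df)) {
  m <- as.matrix(df[cols])
  labels <- toupper(cols[max.col(m, ties.method = "first")])
  labels[rowSums(m) == 0] <- NA  # rows without any 1 get NA
  labels
}

data <- data.frame(a = c(1, 0, 0, 1, 0),
                   b = c(0, 1, 0, 0, 0),
                   c = c(0, 0, 0, 0, 1))
data$tranformed <- one_hot_to_label(data, c("a", "b", "c"))
print(data)
```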
Another option is to build a factor directly:
data$transformed <- factor(apply(data, 1, function(x) which(x == 1)), labels = colnames(data))
or, with hard-coded letters (use letters instead of LETTERS for lowercase):
factor(LETTERS[apply(data, 1, function(x) which(x == 1))])
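One caveat with the `labels = colnames(data)` form: it relies on every column being hit at least once, since `factor()` complains when the number of labels does not match the number of distinct values. A possible variation, shown here as a sketch on made-up data where column `c` is never 1, is to pass `levels` explicitly:

```r
# Example data (an assumption for illustration): column c is never 1
data <- data.frame(a = c(1, 0, 0, 1, 0),
                   b = c(0, 1, 1, 0, 1),
                   c = c(0, 0, 0, 0, 0))

# Column index of the 1 in each row
idx <- apply(data, 1, function(x) which(x == 1))

# Explicit levels keep unused categories and keep labels aligned with columns
f <- factor(idx, levels = seq_along(data), labels = toupper(names(data)))
print(f)
```

The unused category `C` is kept as a level, which is usually what you want for a categorical column.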
EDIT: If a row contains only 0s, as in the 3rd row of the following example:
df <- data.frame(a = c(1, 0, 0, 1, 0),
                 b = c(0, 1, 0, 0, 0),
                 c = c(0, 0, 0, 0, 1))
a b c
1 1 0 0
2 0 1 0
3 0 0 0
4 1 0 0
5 0 0 1
You can't use the solutions above, because apply will then return a list in which the element for the all-zero row has length 0.
A workaround:
idx <- apply(df, 1, function(x) which(x == 1))
LETTERS[unlist(ifelse(sapply(idx, length) == 1, idx, NA))]
[1] "A" "B" NA "A" "C"
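A possibly simpler variant of the same idea (a sketch, assuming each row has at most one 1) is to return `NA` from inside the function, so that `apply` always gets a length-1 result and returns a plain vector instead of a list:

```r
df <- data.frame(a = c(1, 0, 0, 1, 0),
                 b = c(0, 1, 0, 0, 0),
                 c = c(0, 0, 0, 0, 1))

# Return the column index of the single 1, or NA when the row has none
idx <- apply(df, 1, function(x) {
  hit <- which(x == 1)
  if (length(hit) == 1) hit else NA_integer_
})

res <- LETTERS[idx]
print(res)
```

Indexing `LETTERS` with `NA` yields `NA`, so the all-zero row drops out without any extra handling.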