
I have a dataset which looks like this:

data_original <- matrix(c("class1","class2","class3","class1","class2","class3","class1","class2","class3"),ncol=1,byrow=TRUE)
colnames(data_original) <- c("class")
rownames(data_original) <- c("student1","student2","student3","student1","student2","student3","student1","student2","student3")
data_original <- as.table(data_original)
data_original

         class 
student1 class1
student2 class2
student3 class3
student1 class1
student2 class2
student3 class3
student1 class1
student2 class2
student3 class3

I want it to look like this:

data_req <- matrix(c(1,1,0,1,0,0,1,1,0),ncol=3,byrow=TRUE)
colnames(data_req) <- c("class1","class2","class3")
rownames(data_req) <- c("student1","student2","student3")
data_req <- as.table(data_req)
data_req

        class1 class2 class3
student1      1      1      0
student2      1      0      0
student3      1      1      0

Basically, I want to turn each value in the class column, which indicates the class a student is taking, into a column of its own. Is there an R package that can do that?

jmich738
  • Your output does not match your input. – Rich Scriven Jan 15 '16 at 04:17
  • Is your input actually a `table`, or is it a data.frame? – David Robinson Jan 15 '16 at 04:18
  • *Very* close to a duplicate of http://stackoverflow.com/q/11659128/496803 – thelatemail Jan 15 '16 at 04:36
  • The original data to be transformed does not seem correct. For the desired output, student and class should follow the sequences "1, 2, 3" and "1, 1, 1", whereas the data above has "1, 2, 3" and "1, 2, 3". – steveb Jan 15 '16 at 04:42
  • Sorry, yes, the output does not match; I was trying to provide an example of the type of output I wanted. I could not think of the name for this kind of table. @thelatemail - yes, mine looks like a duplicate. I could not think of a name to search for; I guess "binary table" is a good name. – jmich738 Jan 15 '16 at 05:13

3 Answers

2

Given the desired output, it seems the input should be a long-format data frame with one row per student/class pair and a 0/1 indicator column, something like this:

data_original <- structure(list(student = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L), .Label = c("student1", "student2", "student3"), class = "factor"), 
    class = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("class1", 
    "class2", "class3"), class = "factor"), val = c(1, 1, 1, 
    1, 0, 1, 0, 0, 0)), .Names = c("student", "class", "val"), row.names = c(NA, 
-9L), class = "data.frame")

In a more readable form

   student  class val
1 student1 class1   1
2 student2 class1   1
3 student3 class1   1
4 student1 class2   1
5 student2 class2   0
6 student3 class2   1
7 student1 class3   0
8 student2 class3   0
9 student3 class3   0

A tidyr solution would be as follows

library(dplyr)
library(tidyr)

data_original %>% spread(class, val)
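In recent tidyr releases, spread() is documented as superseded by pivot_wider(). The following is a minimal sketch of the same reshape with that function, assuming tidyr 1.0.0 or later is installed. The name data_long is hypothetical; it rebuilds the data frame shown above with data.frame() so the block runs on its own.

```r
library(tidyr)

# Same long-format data as above, rebuilt with data.frame() for readability.
# (data_long is a hypothetical name, not part of the original answer.)
data_long <- data.frame(
  student = rep(c("student1", "student2", "student3"), times = 3),
  class   = rep(c("class1", "class2", "class3"), each = 3),
  val     = c(1, 1, 1, 1, 0, 1, 0, 0, 0)
)

# One row per student, one column per class, cells filled from val.
data_wide <- pivot_wider(data_long, names_from = class, values_from = val)
data_wide
```

The result is a tibble with a student column followed by class1, class2 and class3, rather than a table with row names.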
steveb
1

I think this will be easier if you first transform your data into a data frame.

df <- data.frame(student=rownames(data_original), class=data_original[,1])

Then you can just use

library(reshape2)
dcast(unique(df), student ~ class, length, value.var="class")
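If you would rather not load a package, base R's table() can build the same kind of 0/1 matrix from a data frame of student/class pairs. This is only a sketch: the enrol data below are hypothetical, chosen so that the result matches the output requested in the question, and the explicit factor levels are an assumption needed to get a class3 column even though no student takes it.

```r
# Hypothetical enrolment data: one row per (student, class) pair.
enrol <- data.frame(
  student = c("student1", "student1", "student2", "student3", "student3"),
  class   = c("class1", "class2", "class1", "class1", "class2")
)

# Declare all three classes as levels so that class3 still gets a column.
enrol$class <- factor(enrol$class, levels = c("class1", "class2", "class3"))

# Count rows per (student, class); unique() drops repeated pairs,
# so every cell is either 0 or 1.
binary <- table(unique(enrol))
binary
```

As with dcast(), unique() matters only if the same pair can appear more than once in the data.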
Carlos Alberto
1

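We can use xtabs from base R, taking data_original to be the long-format data frame with student, class and val columns: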

xtabs(val~student+class, data_original)
#             class
#student    class1 class2 class3
#  student1      1      1      0
#  student2      1      0      0
#  student3      1      1      0
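xtabs() returns a table object. If a plain data frame is needed for further processing, one option is sketched below. The names data_long, tab and wide are hypothetical, and data_long simply rebuilds the long-format data so the block runs on its own.

```r
# Long-format data with a 0/1 indicator column (hypothetical name data_long).
data_long <- data.frame(
  student = rep(c("student1", "student2", "student3"), times = 3),
  class   = rep(c("class1", "class2", "class3"), each = 3),
  val     = c(1, 1, 1, 1, 0, 1, 0, 0, 0)
)

# Sum val within each (student, class) cell.
tab <- xtabs(val ~ student + class, data = data_long)

# as.data.frame() on a table gives long format again,
# so call the matrix method to keep the wide layout.
wide <- as.data.frame.matrix(tab)
wide
```

The students end up as row names and each class becomes an ordinary numeric column.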
akrun