
I am working in R and need help with the problem below. My data is in the following format:

Users   Lang_1  Lang_2  Lang_3  Lang_4  Lang_5
user_1  C       SAS     Python  SPSS    Java
user_2  R       C++     Java
user_3  SAS     R       Python  Octave
user_4  iPython SQL     R
user_5  SQL     Java    Dot Net Python

I need my output to be in the following format:

Users   C  R  SAS  iPython  SQL  C++  Java  Python  DotNet  SPSS  Octave
user_1  1  0  1    0        0    0    1     1       0       1     0
user_2  0  1  0    0        0    1    1     0       0       0     0
user_3  0  1  1    0        0    0    0     1       0       0     1
user_4  0  1  0    1        1    0    0     0       0       0     0
user_5  0  0  0    0        1    0    1     1       1       0     0

I want to use this information for a classification task. Please help me out.

  • Please read the documentation for `melt` and `cast`. – Ping Jin Aug 21 '15 at 15:01
  • To the others: Isn't there an English word for his "desired output"? It always seems to me like the real problem is that people don't know what to search for... – maj Aug 21 '15 at 15:18
  • You can start looking into it by checking those 2 links: a) http://www.statmethods.net/management/reshape.html , b) http://www.r-bloggers.com/introducing-tidyr/ – AntoniosK Aug 21 '15 at 15:42
  • I am clueless as to how to go about it. Thanks for your suggestions. – himanshu tripathi Aug 22 '15 at 17:42

1 Answer

library(reshape)

#read the problem data-frame

data <- read.csv(file.choose())

#pass the index of id variable

data_m <- melt(data,id.vars = 1)

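#remove the observations where the value column is blank or NA
#(a logical filter is used because -which(...) would drop every row if nothing matched)

data_m <- data_m[!is.na(data_m$value) & data_m$value != "", ]

#drop the variable column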

data_m <- data_m[,-2]

#desired output by running below command

cast(data_m,Users~value,length)
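
If you want to try the idea without a CSV file or the reshape package, here is a minimal base R sketch of the same steps (stack the language columns, drop the blanks, count per user). It is not the code above, just an alternative using `table()`. The data frame is typed in from the question, and I am assuming that "Dot Net" is a single value written as DotNet (as in the desired output) and that missing languages are empty strings. The names `lang_cols`, `long`, `indicator` and `result` are my own.

```r
# Example data typed in from the question; missing languages are empty strings.
# Assumption: "Dot Net" is one value, written "DotNet" as in the desired output.
data <- data.frame(
  Users  = paste0("user_", 1:5),
  Lang_1 = c("C", "R", "SAS", "iPython", "SQL"),
  Lang_2 = c("SAS", "C++", "R", "SQL", "Java"),
  Lang_3 = c("Python", "Java", "Python", "R", "DotNet"),
  Lang_4 = c("SPSS", "", "Octave", "", "Python"),
  Lang_5 = c("Java", "", "", "", ""),
  stringsAsFactors = FALSE
)

# Stack the language columns into one long column, repeating the user ids
# once per language column (unlist() goes column by column).
lang_cols <- setdiff(names(data), "Users")
long <- data.frame(
  Users = rep(data$Users, times = length(lang_cols)),
  value = unlist(data[lang_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop blank or missing entries.
long <- long[!is.na(long$value) & long$value != "", ]

# Cross-tabulate users against languages, then cap counts at 1 so that a
# language listed twice for the same user still gives a 0/1 indicator.
indicator <- table(long$Users, long$value)
indicator[indicator > 1] <- 1

# Turn the table into a plain data frame with Users as the first column.
# check.names = FALSE keeps column names such as "C++" unchanged.
result <- data.frame(Users = rownames(indicator),
                     unclass(indicator),
                     check.names = FALSE)
rownames(result) <- NULL
```

The columns come out in the sort order used by `table()`, not in the order shown in the question; reorder them with something like `result[, c("Users", "C", "R", ...)]` if the order matters for your classifier.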