3

I have a data frame with the following format:

name workplace
a     A
b     B
c     A
d     C
e     D
....

I would like to convert this data frame into an affiliation network in R with the format

    A B C D ...
a   1 0 0 0
b   0 1 0 0
c   1 0 0 0
d   0 0 1 0
e   0 0 0 1
...

and I used the following program:

for (i in 1:nrow(A1)) {  
  a1[rownames(a1) == A1$name[i],
     colnames(a1) == A1$workplace[i]] <- 1
}

where A1 is the data frame and a1 is the affiliation network. However, since my data frame is large, the loop above runs very slowly. Is there an efficient way to do this conversion without looping?

Thank you very much!

2 Answers

3

If your data frame is called df, just do:

as.data.frame.matrix(table(df))
#   A B C D
# a 1 0 0 0
# b 0 1 0 0
# c 1 0 0 0
# d 0 0 1 0
# e 0 0 0 1
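
One caveat worth illustrating: table() counts occurrences, so a name/workplace pair that appears more than once yields a value above 1. The sketch below uses hypothetical sample data (not the asker's) with one repeated pair and collapses the counts to 0/1; whether repeats occur in the real data is an assumption.

```r
# Hypothetical sample data; the pair (a, A) appears twice on purpose
df <- data.frame(
  name      = c("a", "b", "c", "d", "e", "a"),
  workplace = c("A", "B", "A", "C", "D", "A"),
  stringsAsFactors = FALSE
)

counts <- table(df$name, df$workplace)   # cross-tabulate name by workplace
aff <- as.data.frame.matrix(counts)      # wide data frame, one column per workplace
aff[aff > 0] <- 1L                       # collapse repeated pairs to a 0/1 indicator
aff
```

If every pair is known to be unique, the last assignment is unnecessary.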
David Arenburg
  • This is really cool. By the way, I have a longitudinal network, so after this transformation the column dimensions differ across years, e.g. one year has 1000 columns and another has 2000. I want all the matrices across years to have the same column dimensions. How would you do that? Thanks a lot! – user2521514 Sep 11 '14 at 09:18
  • Can you provide an actual example? – David Arenburg Sep 11 '14 at 09:19
  • Sure. I ran your command as follows: for (i in 0:18) { a <- eval(parse(text=paste0("A",i))); assign(x=paste0("df", i), value = as.data.frame.matrix(table(a[,-3]))) } – user2521514 Sep 11 '14 at 09:56
  • No, I mean provide actual data that this solution won't work correctly with – David Arenburg Sep 11 '14 at 09:58
  • Sure. I got 19 matrices of the same row number but different column numbers. e.g. df1 is A B C a 1 0 0 b 0 1 0 c 1 0 0 d 0 0 1 and df2 is A B D a 1 0 0 b 0 1 0 c 1 0 0 d 0 0 0 How do I convert both df1 and df2 to matrices that have 4 columns and 4 rows such as A B C D a b c d – user2521514 Sep 11 '14 at 10:08
  • sorry for the matrix format. the row names are a b c d, and column names are A B C and A B D. – user2521514 Sep 11 '14 at 10:10
  • It's not so clear from your comment. Maybe ask a new question and link to it here – David Arenburg Sep 11 '14 at 10:36
  • Sorry that the previous comments were not clear. I posted a question here: http://stackoverflow.com/questions/25785605/paste-matrix-in-r. Hope this will clarify the issue. – user2521514 Sep 11 '14 at 10:55
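
For the follow-up raised in the comments (matrices from different years having different columns), one possible approach is to fix a shared set of factor levels before calling table(), so every year yields the same rows and columns. This is only a sketch: the data frames y1 and y2 and the helper to_matrix are hypothetical, and it may differ from the answers given in the linked question.

```r
# Hypothetical per-year data; year 2 has workplace D but no C, and no row for d
y1 <- data.frame(name = c("a", "b", "c", "d"),
                 workplace = c("A", "B", "A", "C"),
                 stringsAsFactors = FALSE)
y2 <- data.frame(name = c("a", "b", "c"),
                 workplace = c("A", "B", "D"),
                 stringsAsFactors = FALSE)
years <- list(y1 = y1, y2 = y2)

# Union of all names and workplaces seen in any year
all_names  <- sort(unique(unlist(lapply(years, `[[`, "name"), use.names = FALSE)))
all_places <- sort(unique(unlist(lapply(years, `[[`, "workplace"), use.names = FALSE)))

# Hypothetical helper: tabulate one year against the shared levels,
# so missing names or workplaces appear as all-zero rows or columns
to_matrix <- function(d) {
  tab <- table(factor(d$name, levels = all_names),
               factor(d$workplace, levels = all_places))
  as.data.frame.matrix(tab)
}

mats <- lapply(years, to_matrix)
mats$y2
```

Because the levels are fixed up front, every element of mats has identical dimensions and dimnames, which makes the matrices directly comparable across years.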
0

Maybe this also helps:

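 m1 <- model.matrix(~ 0 + workplace, data = dat)
 # label rows by name; strip the "workplace" prefix that model.matrix adds to column names
 dimnames(m1) <- list(dat$name, sub("^workplace", "", colnames(m1)))
 as.data.frame(m1)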
 #  A B C D
 #a 1 0 0 0
 #b 0 1 0 0
 #c 1 0 0 0
 #d 0 0 1 0
 #e 0 0 0 1
akrun
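
For completeness, the loop in the question can also be vectorized directly by indexing the matrix with a two-column matrix of (row, column) positions. This is a minimal sketch on hypothetical sample data; A1 and a1 follow the question's naming, while the construction of a1 shown here is an assumption, since the question does not show how it was created.

```r
# Hypothetical sample data in the question's layout
A1 <- data.frame(
  name      = c("a", "b", "c", "d", "e"),
  workplace = c("A", "B", "A", "C", "D"),
  stringsAsFactors = FALSE
)

rows <- unique(A1$name)
cols <- sort(unique(A1$workplace))

# Start from an all-zero matrix with names as rows and workplaces as columns
a1 <- matrix(0L, nrow = length(rows), ncol = length(cols),
             dimnames = list(rows, cols))

# One (row, column) index pair per record, then a single vectorized assignment
idx <- cbind(match(A1$name, rows), match(A1$workplace, cols))
a1[idx] <- 1L
a1
```

This keeps the result as a matrix rather than a data frame, which may be preferable if it is passed on to network packages that expect a matrix.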