
I am studying social network analysis and will be using Ucinet to draw network graphs. For this, I have to convert a CSV file to edge list format. Converting an adjacency matrix to an edge list worked, but I am having trouble converting an incidence matrix to an edge list.

  1. The CSV file ('some.csv') I have contains an incidence matrix like this:

        A  B  C  D
    a   1  0  3  1
    b   0  0  0  2
    c   3  2  0  1
    
  2. The code that converted the adjacency matrix to the edge list was as follows:

    x<-read.csv("C:/.../something.csv", header=T, row.names=1)
    net<-as.network(x, matrix.type='adjacency', ignore.eval=FALSE, names.eval='dd', loops=FALSE)
    el<-edgelist(net, attrname='dd')
    write.csv(el, file='C:/.../result.csv')
    
  3. For the incidence matrix, I only succeeded in loading the file. I tried to follow the method above, but I get an error:

    y<-read.csv("C:/.../some.csv", header=T, row.names=1)
    net2<-network(y, matrix.type='incidence', ignore.eval=FALSE, names.eval='co', loops=FALSE)
    Error in network.incidence(x, g, ignore.eval, names.eval, na.rm, edge.check) : 
      Supplied incidence matrix has empty head/tail lists. (Did you get the directedness right?)

  4. I want to see the result in this way:

    a A 1
    a C 3
    a D 1
    b D 2
    c A 3
    c B 2
    c D 1
    

    I tried changing the values as the error message suggested, but I could not get the result I wanted. Thank you for any assistance with this.

C.H.

2 Answers


Here's your data:

inc_mat <- matrix(
  c(1, 0, 3, 1,
    0, 0, 0, 2,
    3, 2, 0, 1),
  nrow = 3, ncol = 4, byrow = TRUE
  )
rownames(inc_mat) <- letters[1:3]
colnames(inc_mat) <- LETTERS[1:4]

inc_mat
#>   A B C D
#> a 1 0 3 1
#> b 0 0 0 2
#> c 3 2 0 1

Here's a generalized function that does the trick:

as_edgelist.weighted_incidence_matrix <- function(x, drop_rownames = TRUE) {
  melted <- do.call(cbind, lapply(list(row(x), col(x), x), as.vector)) # 3 col matrix of row index, col index, and `x`'s values

  filtered <- melted[melted[, 3] != 0, , drop = FALSE] # drop rows where column 3 is 0; `drop = FALSE` keeps a matrix even if only one row remains

                                                       # data frame where first 2 columns are...
  df <- data.frame(mode1 = rownames(x)[filtered[, 1]], # `x`'s rownames, indexed by the first column in `filtered`
                   mode2 = colnames(x)[filtered[, 2]], # `x`'s colnames, indexed by the second column in `filtered`
                   weight = filtered[, 3],             # the third column in `filtered`
                   stringsAsFactors = FALSE)

  out <- df[order(df$mode1), ] # sort by first column

  if (!drop_rownames) {
    return(out)
  }
  `rownames<-`(out, NULL)
}
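
To make the first step of the function easier to follow, here is a small sketch of the intermediate "melted" matrix on its own. The column labels `row`, `col`, and `value` are added here purely for illustration; the function above leaves the columns unnamed.

```r
# Same example data as above, built in one call.
x <- matrix(
  c(1, 0, 3, 1,
    0, 0, 0, 2,
    3, 2, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(letters[1:3], LETTERS[1:4])
)

# row(x) and col(x) give, for every cell, its row index and column index.
# Flattening all three with as.vector() walks the cells in column-major order.
melted <- cbind(
  row   = as.vector(row(x)),
  col   = as.vector(col(x)),
  value = as.vector(x)
)

head(melted, 4)
#>      row col value
#> [1,]   1   1     1
#> [2,]   2   1     0
#> [3,]   3   1     3
#> [4,]   1   2     0
```

Each row of `melted` is one cell of the matrix, so dropping the rows whose value is 0 leaves exactly the edges.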

Take it for a spin:

el <- as_edgelist.weighted_incidence_matrix(inc_mat)
el
#>   mode1 mode2 weight
#> 1     a     A      1
#> 2     a     C      3
#> 3     a     D      1
#> 4     b     D      2
#> 5     c     A      3
#> 6     c     B      2
#> 7     c     D      1

Here are the results you wanted:

control_df <- data.frame(
  mode1 = c("a", "a", "a", "b", "c", "c", "c"),
  mode2 = c("A", "C", "D", "D", "A", "B", "D"),
  weight = c(1, 3, 1, 2, 3, 2, 1),
  stringsAsFactors = FALSE
  )

control_df
#>   mode1 mode2 weight
#> 1     a     A      1
#> 2     a     C      3
#> 3     a     D      1
#> 4     b     D      2
#> 5     c     A      3
#> 6     c     B      2
#> 7     c     D      1

Do they match?

identical(control_df, el)
#> [1] TRUE
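
To connect this back to the CSV workflow in the question, here is a minimal base R sketch that starts from `read.csv()` output. It assumes the file has the row labels in its first column; the inline `csv_lines` is a hypothetical stand-in for 'some.csv', and the output goes to a temporary file rather than the real path. It swaps in `which(..., arr.ind = TRUE)` for the melt step, which is a different technique from the function above.

```r
# Hypothetical stand-in for the contents of 'some.csv'.
csv_lines <- c(",A,B,C,D", "a,1,0,3,1", "b,0,0,0,2", "c,3,2,0,1")
y <- read.csv(text = paste(csv_lines, collapse = "\n"),
              header = TRUE, row.names = 1)

# read.csv() returns a data frame; convert it to a matrix first.
m <- as.matrix(y)

# Row/column positions of every non-zero cell.
idx <- which(m != 0, arr.ind = TRUE)

el2 <- data.frame(
  mode1  = rownames(m)[idx[, "row"]],
  mode2  = colnames(m)[idx[, "col"]],
  weight = m[idx],                      # matrix indexing picks each cell's value
  stringsAsFactors = FALSE
)
el2 <- el2[order(el2$mode1, el2$mode2), ]
rownames(el2) <- NULL
el2
#>   mode1 mode2 weight
#> 1     a     A      1
#> 2     a     C      3
#> 3     a     D      1
#> 4     b     D      2
#> 5     c     A      3
#> 6     c     B      2
#> 7     c     D      1

# Write the edge list out; a temporary file is used here as a placeholder path.
out_file <- tempfile(fileext = ".csv")
write.csv(el2, file = out_file, row.names = FALSE)
```

The `as.matrix()` call matters: the function above flattens its input with `as.vector()`, which does not behave the same way on a data frame as on a matrix.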
knapply

This might not be the most efficient way, but it produces the expected result:

y <- matrix( c(1,0,3,0,0,2,3,0,0,1,2,1), nrow=3)
colnames(y) <- c("e.A","e.B","e.C","e.D")

dt <- data.frame(rnames=c("a","b","c"))
dt <- cbind(dt, y)
#  rnames e.A e.B e.C e.D
#1      a   1   0   3   1
#2      b   0   0   0   2
#3      c   3   2   0   1

# use the reshape() function to convert the data frame into long format
M <- reshape(dt, direction="long", idvar = "rnames", varying = c("e.A","e.B","e.C","e.D"))
M <- M[M$e >0,]
M
#     rnames time e
# a.A      a    A 1
# c.A      c    A 3
# c.B      c    B 2
# a.C      a    C 3
# a.D      a    D 1
# b.D      b    D 2
# c.D      c    D 1


# If M needs to be sorted by the column rnames:
M[order(M$rnames), ]
#     rnames time e
# a.A      a    A 1
# a.C      a    C 3
# a.D      a    D 1
# b.D      b    D 2
# c.A      c    A 3
# c.B      c    B 2
# c.D      c    D 1
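
The example above hardcodes the `e.A` ... `e.D` column names. As a rough sketch of how the same `reshape()` idea might be applied to data read from a CSV, where the columns are simply `A` ... `D`, the variant below passes `v.names` and `times` explicitly instead of relying on the `e.` prefix. The inline `csv_lines` is a hypothetical stand-in for 'some.csv', and it assumes the row labels are in the first column.

```r
# Hypothetical stand-in for the contents of 'some.csv'.
csv_lines <- c(",A,B,C,D", "a,1,0,3,1", "b,0,0,0,2", "c,3,2,0,1")
y <- read.csv(text = paste(csv_lines, collapse = "\n"),
              header = TRUE, row.names = 1)

# Move the row names into an ordinary column so reshape() can use it as the id.
dt <- data.frame(rnames = rownames(y), y,
                 check.names = FALSE, stringsAsFactors = FALSE)
value_cols <- setdiff(names(dt), "rnames")

# Stack all value columns into one column `e`, labelled by `time`.
M <- reshape(dt, direction = "long", idvar = "rnames",
             varying = value_cols, v.names = "e",
             timevar = "time", times = value_cols)

M <- M[M$e > 0, ]                       # keep only non-zero cells
M <- M[order(M$rnames, M$time), ]       # sort by row label, then column label
M
```

Naming `v.names` and `times` directly means the column names do not need to follow the `name.suffix` pattern that `reshape()` otherwise splits on.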
Katia