I am working with a retweet network using igraph. The network is directed: each edge connects a user to the user they retweeted.

My data is an edge list in which arrows go from the retweeter to the retweeted user. There are no connections among retweeters (that is, all retweeters have in-degree 0, as they don't retweet each other).

I would like to simplify the network by connecting retweeters that have friends in common, that is, retweeters who retweeted the same user.

Consider the following reprex:

edgelist <- read.table(text = "
                       A C
                       B C
                       D C")

g <- graph.data.frame(edgelist, directed = T)

In this case nodes A, B and D all retweet node C, so I would like to connect all of them in the following way:

[Image: expected result, with A, B and D all connected to each other]

Ideally, I would also like to add edge weights to the final network, based on the number of times users retweet from a given user, but that may be a separate question.
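As a rough sketch of what such weights could look like, here is one possible reading: the weight of an edge between two retweeters is the number of retweeted users they have in common. The toy data, the column names `from` and `to`, and the object names `pairs`, `weighted` and `gw` are assumptions made for illustration, and the sketch assumes at least one pair of retweeters shares a retweeted user.

```r
library(igraph)

# Toy data (assumed for illustration): A and B both retweet C and E
el <- data.frame(
  from = c("A", "B", "D", "A", "B"),
  to   = c("C", "C", "C", "E", "E"),
  stringsAsFactors = FALSE
)

# Drop repeated retweets, then pair up retweeters pointing to the same user
el <- unique(el)
pairs <- merge(el, el, by = "to", suffixes = c("_1", "_2"))

# Keep each unordered pair once and drop self-pairs
pairs <- pairs[pairs$from_1 < pairs$from_2, ]

# Weight = number of retweeted users the two retweeters share
weighted <- aggregate(to ~ from_1 + from_2, data = pairs, FUN = length)
names(weighted)[3] <- "weight"

# The third column becomes the edge attribute "weight"
gw <- graph_from_data_frame(weighted, directed = FALSE)
```

In this toy data A and B share two retweeted users, so their edge gets weight 2, while the A--D and B--D edges get weight 1.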

I have tried the following function. It works on small toy networks, but on my real network (thousands of edges) it collapses:

connect_friends<-function(edgelist){
  g <- graph.data.frame(edgelist, directed = T)
  g <- delete_vertices( g, 
                        (!V(g) %in% c(V(g)[[degree(g, mode = "in")>=2]])) & 
                          (!V(g) %in% c(V(g)[[degree(g, mode = "in")==0]])))
  el <- as.data.frame(get.edgelist(g))
  ids <- unique(c(el$V1, el$V2))
  
  y <- lapply(ids, function(id) {
    
    x <- el[which(el$V1 == id | el$V2 == id),]
    alt_nodes <- setdiff(unique(c(x$V1, x$V2)), id)
    
  })
  
  if(length(y)==0) {
    stop("No common friends found")
  }
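  ne2 <- NULL
  for (i in seq_along(y)) {
    new_edge <- y[[i]]
    # Only nodes with at least two neighbours produce pairs
    if (length(new_edge) >= 2) {
      ne2 <- rbind(t(combn(new_edge, 2)), ne2)
    }
  }
  # Undirected graph built from the collected pairs, returned to the caller
  graph.data.frame(ne2, directed = FALSE)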
  
}


Is there a more efficient way of doing it?

Thanks a lot in advance!

Luis

1 Answer

Update

With the updated example from the comments (edges A -> C and B -> D), the code below gives

> gres
IGRAPH a824a46 UN-- 4 0 -- 
+ attr: name_1_1 (g/c), name_1_2 (g/c), name_2_1 (g/c), name_2_2 (g/c),
| loops_1_1 (g/l), loops_1_2 (g/l), loops_2_1 (g/l), loops_2_2 (g/l),
| name (v/c)
+ edges from a824a46 (vertex names):

and plot(gres) shows

[Image: plot of gres, showing four isolated vertices]

You can combine graph.union, disjoint_union and make_full_graph as shown below:

gres <- do.call(
  graph.union,
  lapply(
    names(V(g))[degree(g, mode = "out") == 0],
    function(x) {
      nbs <- names(V(g))[distances(g, v = x, mode = "in") == 1]
      disjoint_union(
        set_vertex_attr(make_full_graph(length(nbs)), name = "name", value = nbs),
        set_vertex_attr(make_full_graph(1), name = "name", value = x)
      )
    }
  )
)

which gives

> gres
IGRAPH cb11da8 UN-- 4 3 -- 
+ attr: name_1 (g/c), name_2 (g/c), loops_1 (g/l), loops_2 (g/l), name
| (v/c)
+ edges from cb11da8 (vertex names):
[1] A--B A--D B--D

and plot(gres) shows

[Image: plot of gres, showing A, B and D connected to each other]
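For larger networks, a different route (not the approach above) is to group retweeters by the user they retweeted and build the pairs within each group. The following is a minimal sketch under stated assumptions: the helper name `connect_by_common_target` and the column names `from` and `to` are hypothetical, and the input is assumed to be a two-column edge list running from retweeter to retweeted user.

```r
library(igraph)

# Toy edge list from the question: retweeter -> retweeted user
edgelist <- data.frame(from = c("A", "B", "D"), to = c("C", "C", "C"),
                       stringsAsFactors = FALSE)

# Hypothetical helper: connect retweeters that share a retweeted user
connect_by_common_target <- function(el) {
  el <- unique(el[, 1:2])
  retweeters <- unique(as.character(el[[1]]))

  # Group retweeters by the user they retweeted
  groups <- split(as.character(el[[1]]), as.character(el[[2]]))

  # All unordered pairs within each group of two or more retweeters
  pairs <- lapply(groups[lengths(groups) >= 2],
                  function(x) t(combn(sort(x), 2)))
  pairs <- unique(do.call(rbind, pairs))

  # No shared retweeted user: return the retweeters as isolated vertices
  if (is.null(pairs)) {
    return(make_empty_graph(n = 0, directed = FALSE) + vertices(retweeters))
  }

  graph_from_data_frame(as.data.frame(pairs), directed = FALSE,
                        vertices = data.frame(name = retweeters))
}

gres2 <- connect_by_common_target(edgelist)
```

On the toy edge list this yields the edges A--B, A--D and B--D. Unlike gres above, it keeps only the retweeters as vertices, so a user who is only retweeted (such as C) does not appear in the result.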

ThomasIsCoding
  • Thanks a lot for the effort. The code does not appear to work when there are pairs of nodes, though. Consider the following example: `edgelist <- read.table(text = " A C B D") g <- graph.data.frame(edgelist, directed = T)` The output from `gres` connects A and B and C and D even though they share no common friend. – Luis May 25 '21 at 16:06
  • @Luis I updated my answer, now it works for your new example in the comment. – ThomasIsCoding May 25 '21 at 16:25
  • @ThomasIsCoding Thanks a lot for the update. I have tried it on the example and it handles it, but now I am having trouble with triads. See this one: `edgelist <- read.table(text = " A B A C")` Might it be because there is no distinction between in-degree and out-degree? In that case B and C should not be connected, as they are not the users retweeting (i.e. they have in-degree != 0). Sorry for bothering you again with this! – Luis May 25 '21 at 16:55
  • Thanks a lot!! I believe it does what I want now :) – Luis May 25 '21 at 20:31
  • @Luis You are welcome. Glad it helped you. – ThomasIsCoding May 25 '21 at 20:52
  • Sorry to go back to this. I have a simple question regarding vertex/node attributes. I am using your solution above where my initial igraph object is of the form `g <- graph_from_data_frame(joint_rt[, c("author.id" , "rt_user","id")])` . My problem is that I'd like the final output `gres` to preserve the edge attribute `id` , but the code does not store it. Do you have any idea on why this might be? – Luis Jun 19 '21 at 13:32
  • @Luis I guess you'd better create a new question post and ping me, then maybe I can help you with more details. – ThomasIsCoding Jun 19 '21 at 19:28