I have a data frame like this:

h = data.frame(fr  = c('A','A','X','E','B','W','C','Y'),
               t   = c('B','E','Y','C','A','X','A','W'),
               Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))

I've found all the cycles as vertex sequences, each starting at its smallest vertex number, using "r igraph find all cycles" as a reference. The result is:

[[1]]
A E C A 
1 3 6 1 

[[2]]
A B A 
1 4 1 

[[3]]
X Y W X 
2 7 5 2 

Now I'd like to:

  • calculate the sum of Amt over each cycle

  • count the number of edges in each cycle

It would look like this:

A - B - A : 40 + 33 = 73 ; number of edges : 2

A - E - C - A : 30 + 10 + 21 = 61 ; number of edges : 3

X - Y - W - X : 55 + 90 + 78 = 223 ; number of edges : 3

Does anyone have an idea how to calculate this in R? Any help would be greatly appreciated!


FURTHER EDIT PART

Thanks to the reply, I can now calculate the two items above! However, I've run into a small problem.

I can't figure out why the calculation comes out wrong, even after modifying my code many times.

It should look like this:

[[1]]        [[2]]     [[3]]
A E C A      A B A     X Y W X

  Path            sumAmt numberOfEdges
  <fct>            <dbl>         <int>
1 "A - B - A"         73             2
2 "A - E - C - A"     61             3
3 "X - Y - W - X"    223             3

But when I use my own code, the first node does not show up:

[[1]]        [[2]]     [[3]]
  E C A        B A       Y W X

  Path            sumAmt numberOfEdges
  <fct>            <dbl>         <int>
1 " - B - A"         33             2
2 " - E - C - A"     31             3
3 " - Y - W - X"    168             3

Here's my code for finding the cycles. Is there anything I missed?

h = data.frame(fr  = c('A','A','X','E','B','W','C','Y'),
               t   = c('B','E','Y','C','A','X','A','W'),
               Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))

library(igraph)
g <- graph_from_data_frame(h, directed = TRUE)

Cycles = NULL
for(fr in V(g)) {
  for(t in neighbors(g, fr, mode = "out")) {
    Cycles = c(Cycles, 
    lapply(all_simple_paths(g, t, fr, mode = "out"), function(p)c(fr,p)))
  }
}

LongCycles = Cycles[which(sapply(Cycles, length) > 1)]
LongCycles[sapply(LongCycles, min) == sapply(LongCycles, `[`, 1)]

Does anyone have any ideas? That would be very helpful!

  • Is this any different from your previous question? https://stackoverflow.com/questions/59997320/find-all-existed-cycles-from-data-in-r – Ronak Shah Feb 01 '20 at 12:36
  • @RonakShah Yes, it's different. In the prior question I didn't know how to write code to find the cycles. After taking https://stackoverflow.com/questions/55091438/r-igraph-find-all-cycles?noredirect=1&lq=1 as a reference, I can now find them with code. But I'd like to calculate the amount for each cycle as above. This question is an extension of the previous one. – Chen Hobbit Feb 01 '20 at 12:43

1 Answer

There's probably a shorter way, but provided your data is as follows (where h is your table with amounts and all_cycles is the list of cycles):

h = data.frame(fr  = c('A','A','X','E','B','W','C','Y'),
               t   = c('B','E','Y','C','A','X','A','W'),
               Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))

all_cycles <- list(
  c(A = 1, E = 3, C = 6, A = 1),
  c(A = 1, B = 4, A = 1),
  c(X = 2, Y = 7, W = 5, X = 2)
)

you could do:

library(dplyr)

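data.frame(
  # one row per node visit, labelled with the cycle it belongs to
  Nodes = unlist(lapply(all_cycles, names)),
  Path = unlist(lapply(seq_along(all_cycles), 
                       function(x) rep(paste(names(all_cycles[[x]]), collapse = " - "), 
                                       length(all_cycles[[x]]))))
  ) %>%
  group_by(Path) %>%
  # pair each node with its successor to form the edge (fr, t)
  mutate(fr = Nodes, t = lead(Nodes)) %>%
  # look up each edge's amount; the last node has no successor, so its Amt is NA
  left_join(h, by = c("fr", "t")) %>%
  summarise(sumAmt = sum(Amt, na.rm = TRUE), numberOfEdges = sum(!is.na(t)))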

To get:

# A tibble: 3 x 3
  Path          sumAmt numberOfEdges
  <fct>          <dbl>         <int>
1 A - B - A         73             2
2 A - E - C - A     61             3
3 X - Y - W - X    223             3
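
For comparison, here is a minimal base-R sketch of the same calculation without dplyr. It assumes every consecutive pair of node names in a cycle appears exactly once in h and that node names contain no spaces; the helper name cycle_summary is made up for illustration.

```r
h <- data.frame(fr  = c('A','A','X','E','B','W','C','Y'),
                t   = c('B','E','Y','C','A','X','A','W'),
                Amt = c( 40, 30, 55, 10, 33, 78, 21, 90),
                stringsAsFactors = FALSE)

all_cycles <- list(
  c(A = 1, E = 3, C = 6, A = 1),
  c(A = 1, B = 4, A = 1),
  c(X = 2, Y = 7, W = 5, X = 2)
)

# Hypothetical helper: summarise one cycle against the edge table.
cycle_summary <- function(cycle, edges) {
  nodes <- names(cycle)
  from  <- head(nodes, -1)   # every node except the last
  to    <- tail(nodes, -1)   # every node except the first
  # find the row of each (from, to) edge in the edge table
  idx <- match(paste(from, to), paste(edges$fr, edges$t))
  data.frame(Path          = paste(nodes, collapse = " - "),
             sumAmt        = sum(edges$Amt[idx]),  # NA if an edge is missing
             numberOfEdges = length(idx),
             stringsAsFactors = FALSE)
}

result <- do.call(rbind, lapply(all_cycles, cycle_summary, edges = h))
result <- result[order(result$Path), ]
print(result)
```

Unlike the dplyr version, a cycle containing an edge that is absent from h gives NA for sumAmt here rather than silently skipping that edge.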

If the first value in each element of your list is always unnamed, you could do:

data.frame(
  Nodes = unlist(lapply(all_cycles, names)),
  id = unlist(lapply(seq_along(all_cycles), 
                       function(x) rep(x, length(all_cycles[[x]])))), stringsAsFactors = FALSE
  ) %>%
  group_by(id) %>% mutate(Nodes = replace(Nodes, Nodes == "", last(Nodes)),
                          Path = paste(Nodes, collapse = " - ")) %>%
  mutate(fr = Nodes, t = lead(Nodes)) %>%
  group_by(Path, id) %>%
  left_join(h, by = c("fr", "t")) %>%
  summarise(sumAmt = sum(Amt, na.rm = TRUE), numberOfEdges = sum(!is.na(t)))
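
Alternatively, the names could be repaired before any summarising. The sketch below relies only on the fact that a cycle ends at the vertex it starts from, so the last name can be copied into the empty first slot. The input broken_cycles is a hand-built stand-in for the list from the question, and fix_first_name is a made-up helper name. The first name is presumably empty because c(fr, p) prepends a bare vertex id that carries no name.

```r
# Stand-in for the cycles from the question: the first element has an empty name.
broken_cycles <- list(
  setNames(c(1, 3, 6, 1), c("", "E", "C", "A")),
  setNames(c(1, 4, 1),    c("", "B", "A")),
  setNames(c(2, 7, 5, 2), c("", "Y", "W", "X"))
)

# Hypothetical helper: copy the last name into the first position when it is empty.
fix_first_name <- function(cycle) {
  nms <- names(cycle)
  if (!is.null(nms) && nms[1] == "") {
    nms[1] <- nms[length(nms)]
    names(cycle) <- nms
  }
  cycle
}

fixed <- lapply(broken_cycles, fix_first_name)
print(sapply(fixed, function(cy) paste(names(cy), collapse = " - ")))
```

The repaired list has the same shape as all_cycles above, so the first version of the summary code can be applied to it unchanged.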
arg0naut91
  • Thanks a lot! But I have a small problem. – Chen Hobbit Feb 01 '20 at 15:56
  • I don't know why the first node is not shown, so I can't calculate the sum correctly. It should look like: A - B - A ; A - E - C - A, but my code gives: " - B - A" ; " - E - C - A". Do you know the reason for this? – Chen Hobbit Feb 01 '20 at 16:05
  • @ChenHobbit this is because the first value always seems to be unnamed. I've posted a workaround. – arg0naut91 Feb 01 '20 at 18:01