Iteratively join dataset with the resulting dataset

Question

I am trying to find circular paths in a data frame with two columns.

E.g.:

Col1 Col2 
A    B 
C    A 
B    D 
D    C


So, A-B-D-C-A is a circular route

df <- sqldf("Select * from circuit as 'A' INNER JOIN circuit as 'B' ON A.'To'= B.'FROM'")
result <- df[df$`FROM`==df$`TO..4`,]

This gives me all the bidirectional routes. Is there a way to perform the join iteratively and find all possible circular routes?

I feel that your sample data is a bit too simplistic to be able to provide a robust and generalisable solution. For example, what is your expected output for more complex cases (e.g. in the case of ambiguous paths)? I would probably approach this from the point of graphs. `library(igraph); ig <- graph_from_data_frame(df)` will return an `igraph` object. You can then identify connected subgraphs with `clusters` and work from there; or use `decompose.graph` to split `ig` into a `list` of connected subgraphs. — Maurits Evers, Apr 25 '19 at 05:35

score 0 · Accepted Answer · answered Apr 25 '19 at 05:59

Further to my comment above, I think a good starting point to address your question would be to translate your structure into a graph.

df <- read.table(text =
    "Col1 Col2
    A B
    C A
    B D
    D C", header = TRUE)

library(igraph)
ig <- graph_from_data_frame(df)

We can plot the graph

plot(ig)

I'm not entirely clear on what your expected output is supposed to be, and, as stated, your sample data seems too simplistic to infer a more general solution. Having said that, in this particular case you could extract all cycles of the graph, which correspond to all circular paths of your structure starting from any vertex (adapted from "r igraph find all cycles"):

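cycles <- list()
for (v1 in V(ig)) {
    # A cycle through v1 leaves via an out-neighbour v2 and returns along a simple path v2 -> v1
    for (v2 in neighbors(ig, v1, mode = "out")) {
        # Append every path found; assigning to a single list slot would keep only the first one
        cycles <- c(cycles, lapply(
                all_simple_paths(ig, v2, v1, mode = "out"),
                function(p) c(v1, p)))
    }
}
cycles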
#[[1]]
#  B D C A
#1 3 4 2 1
#
#[[2]]
#  A B D C
#2 1 3 4 2
#
#[[3]]
#  D C A B
#3 4 2 1 3
#
#[[4]]
#  C A B D
#4 2 1 3 4

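The list has four entries, one per starting vertex, and all four are rotations of the same cycle. Each entry is a vector of vertex ids whose first element (the starting vertex) is unnamed, so the first entry (1 3 4 2 1) reads A -> B -> D -> C -> A, the second (2 1 3 4 2) reads C -> A -> B -> D -> C, and so on.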
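If you only want each circular route reported once, one option is to rotate every cycle into a canonical form and then drop duplicates. The following is a minimal sketch rather than a definitive implementation: `canonical_cycle` is a hypothetical helper name, and it assumes cycles are stored as character vectors of vertex names whose first and last elements are the same vertex.

```r
library(igraph)

df <- data.frame(Col1 = c("A", "C", "B", "D"),
                 Col2 = c("B", "A", "D", "C"))
ig <- graph_from_data_frame(df)

# Collect cycles as vectors of vertex names (closed: first == last)
cycles <- list()
for (v1 in as.integer(V(ig))) {
  for (v2 in as.integer(neighbors(ig, v1, mode = "out"))) {
    paths <- all_simple_paths(ig, from = v2, to = v1, mode = "out")
    cycles <- c(cycles, lapply(paths, function(p) {
      V(ig)$name[c(v1, as.integer(p))]
    }))
  }
}

# Hypothetical helper: rotate a closed cycle so that the alphabetically
# smallest vertex comes first, which makes rotations of one cycle identical
canonical_cycle <- function(cyc) {
  open <- cyc[-length(cyc)]          # drop the repeated end vertex
  start <- match(min(open), open)    # position of the smallest vertex name
  rotated <- c(open[start:length(open)], open[seq_len(start - 1)])
  c(rotated, rotated[1])             # close the cycle again
}

unique_cycles <- unique(lapply(cycles, canonical_cycle))
unique_cycles
```

For the sample data this should leave a single route, A -> B -> D -> C -> A. Rotation preserves direction, so a cycle and its reverse are still treated as different routes, which is what you want for a directed graph.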

If you have multiple disconnected cyclic subgraphs, I would decompose your graph into these components first (e.g. using decompose.graph), and then identify cycles per component.
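As a rough sketch of that per-component approach: the example below uses made-up edges with two separate loops, and `find_cycles` is a hypothetical wrapper around the loop shown earlier. In recent igraph versions `decompose.graph` is also available under the name `decompose`, which is what is used here.

```r
library(igraph)

# Assumed example data: two disconnected loops, A -> B -> C -> A and X -> Y -> X
edges <- data.frame(Col1 = c("A", "B", "C", "X", "Y"),
                    Col2 = c("B", "C", "A", "Y", "X"))
ig <- graph_from_data_frame(edges)

# Hypothetical helper: all cycles of a graph as vectors of vertex names,
# one entry per starting vertex (rotations are not removed here)
find_cycles <- function(g) {
  cycles <- list()
  for (v1 in as.integer(V(g))) {
    for (v2 in as.integer(neighbors(g, v1, mode = "out"))) {
      paths <- all_simple_paths(g, from = v2, to = v1, mode = "out")
      cycles <- c(cycles, lapply(paths, function(p) {
        V(g)$name[c(v1, as.integer(p))]
      }))
    }
  }
  cycles
}

# Split into weakly connected components, then search each one separately
parts <- decompose(ig, mode = "weak")
cycles_per_part <- lapply(parts, find_cycles)

# Number of cycle entries found in each component
lengths(cycles_per_part)
```

Searching per component keeps each `all_simple_paths` call confined to a smaller graph, which may help when the full graph is large, since enumerating simple paths can become expensive quickly.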
