
I want to detect a cycle in a data.frame which looks like this:

DF:

Item  City1   City2
A     Delhi   Mumbai
A     Mumbai  Chennai
A     Mumbai  Delhi
B     Delhi   Chennai
B     Chennai Goa

I want to keep only the rows whose cities have a cyclic relation, as here for Item A:

Delhi -> Mumbai -> Delhi

Desired output:

Item  City1   City2
A     Delhi   Mumbai
A     Mumbai  Delhi

Can someone help me detect this?

Cettt
Anshul S
  • If you care about cycles of *more* than 2 cities, treat it as a graph theory problem of finding cycles and use the `igraph` package. [Here is one example](https://stackoverflow.com/a/55094319/903061), there are many others on the site. – Gregor Thomas Oct 17 '19 at 13:30
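
The comment above points to `igraph` for cycles longer than two cities. As a rough, dependency-free sketch of the same idea (this is not the `igraph` approach itself, just base R), one could compute reachability within each Item and keep every row whose edge City1 -> City2 has a path leading back from City2 to City1. The helper name `on_cycle` and the variables `keep` and `result` are made up for illustration.

```r
df <- data.frame(
  Item  = c("A", "A", "A", "B", "B"),
  City1 = c("Delhi", "Mumbai", "Mumbai", "Delhi", "Chennai"),
  City2 = c("Mumbai", "Chennai", "Delhi", "Chennai", "Goa"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each edge from[i] -> to[i], return TRUE if the
# edge lies on a directed cycle of any length.
on_cycle <- function(from, to) {
  from <- as.character(from)
  to   <- as.character(to)
  nodes <- unique(c(from, to))
  n <- length(nodes)
  # reach[u, v] is TRUE when v can be reached from u
  reach <- matrix(FALSE, n, n, dimnames = list(nodes, nodes))
  reach[cbind(from, to)] <- TRUE
  # Warshall's algorithm: transitive closure of the edge relation
  for (k in seq_len(n)) {
    reach <- reach | outer(reach[, k], reach[k, ], "&")
  }
  # An edge u -> v is on a cycle exactly when u is reachable from v
  reach[cbind(to, from)]
}

# Apply the helper separately within each Item
keep <- logical(nrow(df))
for (rows in split(seq_len(nrow(df)), df$Item)) {
  keep[rows] <- on_cycle(df$City1[rows], df$City2[rows])
}
result <- df[keep, ]
result
```

On the sample data this keeps the two Delhi/Mumbai rows of Item A. The closure costs on the order of n^3 per Item for n distinct cities, so for large graphs a dedicated graph package is likely the better choice.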

2 Answers


Here is a possible solution using the dplyr package:

df <- data.frame(Item = c("A", "A", "A", "B", "B"),
                 City1 = c("Delhi", "Mumbai", "Mumbai", "Delhi", "Chennai"),
                 City2 = c("Mumbai", "Chennai", "Delhi", "Chennai", "Goa"))

library(dplyr)

df %>% group_by(Item) %>%
  filter(paste0(City1, City2) %in% paste0(City2, City1)) %>%
  ungroup()

# A tibble: 2 x 3
  Item  City1  City2 
  <chr> <chr>  <chr> 
1 A     Delhi  Mumbai
2 A     Mumbai Delhi
Cettt
  • I don't think you need to group – Sotos Oct 17 '19 at 13:28
  • Hm, not sure. Thought that the `Item` column was meant as a grouping variable. But I am not sure. – Cettt Oct 17 '19 at 13:30
  • You are still pasting 2 columns together. Whether you group or not, `paste(City1, City2)` is always going to be the same – Sotos Oct 17 '19 at 13:32
  • Yes, but a match could then be found in another group. For example, if we add a row `Item = "C", City1 = "Delhi", City2 = "Mumbai"`, then without grouping this row would be returned as well. – Cettt Oct 17 '19 at 13:38
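
To make the grouping point from these comments concrete, here is a small sketch that adds the extra row for Item C mentioned above and runs the filter both with and without `group_by(Item)`. The names `df2`, `grouped` and `ungrouped` are made up for illustration.

```r
library(dplyr)

# Sample data plus the extra row for Item C from the comment
df2 <- data.frame(
  Item  = c("A", "A", "A", "B", "B", "C"),
  City1 = c("Delhi", "Mumbai", "Mumbai", "Delhi", "Chennai", "Delhi"),
  City2 = c("Mumbai", "Chennai", "Delhi", "Chennai", "Goa", "Mumbai"),
  stringsAsFactors = FALSE
)

# With grouping: a reversed pair must exist within the same Item
grouped <- df2 %>%
  group_by(Item) %>%
  filter(paste0(City1, City2) %in% paste0(City2, City1)) %>%
  ungroup()

# Without grouping: a reversed pair anywhere in the data counts
ungrouped <- df2 %>%
  filter(paste0(City1, City2) %in% paste0(City2, City1))

nrow(grouped)    # 2: only the two rows of Item A
nrow(ungrouped)  # 3: the Item C row matches Item A's "Mumbai -> Delhi"
```

Which behaviour is wanted depends on whether `Item` is meant to separate independent route sets, as the question's desired output suggests.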

You can join the data frame with itself, matching each row to the rows of the second copy of df for which all of the following hold:

Item is equal, City1 equals the copy's City2, and City2 equals the copy's City1

merge(df, df, 
      by.x = c('Item', paste0('City', 1:2)), 
      by.y = c('Item', paste0('City', 2:1)))

#    Item  City1  City2
# 1:    A  Delhi Mumbai
# 2:    A Mumbai  Delhi
IceCreamToucan