I have co-occurrence data that can be represented in two columns. The entries in each column are drawn from the same set of possibilities. Ultimately I am aiming to plot a directed network, but first I would like to split the table into co-occurrences that are reciprocal (i.e. both X->Y and Y->X) and those that occur in just one direction (i.e. only Y->Z). Here is an example:
library(tidyverse)
# Example data
from <- c("A", "B", "F", "Q", "T", "S", "D", "E", "A", "T", "F")
to <- c("E", "D", "Q", "S", "F", "T", "B", "A", "D", "A", "E")
df <- tibble(from, to)  # tibble() replaces the deprecated data_frame()
df
# A tibble: 11 x 2
from to
<chr> <chr>
1 A E
2 B D
3 F Q
4 Q S
5 T F
6 S T
7 D B
8 E A
9 A D
10 T A
11 F E
and here is my desired output:
# Desired output 1 - reciprocal co-occurrences
df %>%
  slice(c(1, 2)) %>%
  rename(item1 = from, item2 = to)
# A tibble: 2 x 2
item1 item2
<chr> <chr>
1 A E
2 B D
# Desired output 2 - single occurrences
df %>%
  slice(c(3, 4, 5, 6, 9, 10, 11))
# A tibble: 7 x 2
from to
<chr> <chr>
1 F Q
2 Q S
3 T F
4 S T
5 A D
6 T A
7 F E
If the co-occurrences are reciprocal, it does not matter what order the entries are in; I only need their names. If the co-occurrences are not reciprocal, I need to know the direction.
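To make the rule above concrete, here is a rough non-graph sketch of the split using a self-join in dplyr. It assumes that an edge counts as reciprocal whenever its reverse appears anywhere in the table (so a self-loop such as A->A would also count), and the names `mutual`, `reciprocal` and `single` are just placeholders I made up for illustration.

```r
library(dplyr)

# Same example data as above
df <- tibble(
  from = c("A", "B", "F", "Q", "T", "S", "D", "E", "A", "T", "F"),
  to   = c("E", "D", "Q", "S", "F", "T", "B", "A", "D", "A", "E")
)

# Edges whose reverse (to -> from) also appears somewhere in the table
mutual <- semi_join(df, df, by = c("from" = "to", "to" = "from"))

# Keep each reciprocal pair once; order is irrelevant, so sort the two names
reciprocal <- mutual %>%
  transmute(item1 = pmin(from, to), item2 = pmax(from, to)) %>%
  distinct()

# Edges whose reverse never appears; the original direction is preserved
single <- anti_join(df, df, by = c("from" = "to", "to" = "from"))
```

On the example data this should leave two reciprocal pairs and seven one-directional rows, but I am not sure it is the idiomatic way to do it, hence the graph-based attempt below.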
This feels like a graph problem, so I have had a go, but I am unfamiliar with working with this type of data and most tutorials seem to cover undirected graphs. Looking at the tidygraph package, which I understand uses the igraph package, I have tried this:
library(tidygraph)
df %>%
  as_tbl_graph(directed = TRUE) %>%
  activate(edges) %>%
  mutate(recip_occur = edge_is_mutual()) %>%
  as_tibble() %>%
  filter(recip_occur == TRUE)
# A tibble: 4 x 3
from to recip_occur
<int> <int> <lgl>
1 1 8 TRUE
2 2 7 TRUE
3 7 2 TRUE
4 8 1 TRUE
However, this divorces the edges from the node names (from and to become integer node indices) and repeats each reciprocal co-occurrence. Does anyone have experience with this sort of data?
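For what it is worth, a minimal sketch of how the names might be carried along in the tidygraph attempt is below. It assumes that `.N()` can be used inside the edge `mutate()` to reach the node table, and that `as_tbl_graph()` stores the labels in a node column called `name`. The columns `from_name` and `to_name` are names I introduced for illustration.

```r
library(dplyr)
library(tidygraph)

# Same example data as above
df <- tibble(
  from = c("A", "B", "F", "Q", "T", "S", "D", "E", "A", "T", "F"),
  to   = c("E", "D", "Q", "S", "F", "T", "B", "A", "D", "A", "E")
)

edges <- df %>%
  as_tbl_graph(directed = TRUE) %>%
  activate(edges) %>%
  mutate(
    recip_occur = edge_is_mutual(),
    from_name   = .N()$name[from],  # look up node labels by integer index
    to_name     = .N()$name[to]
  ) %>%
  as_tibble()

# Reciprocal pairs, listed once with the two names sorted
reciprocal <- edges %>%
  filter(recip_occur) %>%
  transmute(item1 = pmin(from_name, to_name),
            item2 = pmax(from_name, to_name)) %>%
  distinct()

# One-directional edges, keeping their direction
single <- edges %>%
  filter(!recip_occur) %>%
  select(from = from_name, to = to_name)
```

This seems to remove the duplication, but I do not know whether it is the intended way to work with mutual edges.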