I have data that looks like this:
conflict_ID country_code SideA
1 1 1
1 2 1
1 3 0
2 4 1
2 5 0
I used the following code, with help from this forum,
library(dplyr)
library(tidyr)
mydf %>%
  group_by(conflict_ID) %>%
  summarise(country_code = combn(country_code, 2, sort, simplify = FALSE),
            .groups = 'drop') %>%
  unnest_wider(country_code, names_sep = '_') %>%
  anti_join(filter(mydf, SideA == 1),
            by = c("conflict_ID", "country_code_2" = "country_code"))
# # A tibble: 3 × 3
# conflict_ID country_code_1 country_code_2
# <int> <int> <int>
# 1 1 1 3
# 2 1 2 3
# 3 2 4 5
to end up with the result you can see above. However, with the actual data, not all conflicts end up in the data frame that is created. A dyad only appears if, in the original table, the SideA country is listed first (SideA = 1) and the other party in the next row has a 0 (indicating that it is not SideA). If it is the other way around, the dyad simply doesn't show up in the resulting table. Any ideas why that might be? I know that the problem is in the anti_join step, but I don't know what the problem actually is.
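To make the symptom concrete, here is a rough sketch of how the missing conflicts can be listed by comparing the conflict IDs in the input with those in the output. The rows below are made up for illustration (they are not my real data), and `result` and `missing_conflicts` are just names I chose here:

```r
library(dplyr)
library(tidyr)

# Made-up input: in conflict 2 the SideA country has the larger code (5)
mydf <- tibble(
  conflict_ID  = c(1, 1, 2, 2),
  country_code = c(1, 3, 4, 5),
  SideA        = c(1, 0, 0, 1)
)

# Same pipeline as above, stored in a variable
result <- mydf %>%
  group_by(conflict_ID) %>%
  summarise(country_code = combn(country_code, 2, sort, simplify = FALSE),
            .groups = "drop") %>%
  unnest_wider(country_code, names_sep = "_") %>%
  anti_join(filter(mydf, SideA == 1),
            by = c("conflict_ID", "country_code_2" = "country_code"))

# Conflicts that are in the input but not in the result
missing_conflicts <- setdiff(unique(mydf$conflict_ID), result$conflict_ID)
missing_conflicts
```

With these toy rows, conflict 1 is kept and conflict 2 is reported as missing, which matches what I see in the real data.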
EDIT: To be more precise, here is an example that works (the countries end up in the resulting table) and one where it doesn't:
# with this input, it works
dispnum ISO3 sidea
4414 AZE 0
4414 ARM 1
# with this input, it does not work
dispnum ISO3 sidea
4613 ARG 0
4613 GHA 1
I think that the first part of the code does something to the data that anti_join then picks up in an unexpected way. Maybe it is because the pairs are ordered alphabetically: the first case works because ARM (which has sidea = 1) comes before AZE, and the second case fails because GHA (which has sidea = 1) comes after ARG?
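One way to check this suspicion is to look at the pair-building step in isolation. The sketch below uses only base R and the two disputes from the example; the variable names (`works`, `broken`, `pair_works`, `pair_broken`) are my own:

```r
# The two disputes from the example above, in their original row order
works  <- c("AZE", "ARM")   # sidea: AZE = 0, ARM = 1
broken <- c("ARG", "GHA")   # sidea: ARG = 0, GHA = 1

# combn() applies `sort` to every pair, so the input row order is discarded
pair_works  <- combn(works, 2, sort, simplify = FALSE)[[1]]
pair_broken <- combn(broken, 2, sort, simplify = FALSE)[[1]]

pair_works   # "ARM" "AZE": the sidea country ends up in position 1
pair_broken  # "ARG" "GHA": the sidea country ends up in position 2
```

If this is right, the row order in the original table is not what matters. Because of `sort`, the alphabetically later country always lands in the second column, and since anti_join matches the sidea = 1 rows against that second column, a dyad seems to be dropped whenever its sidea country sorts last.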