I want to generate a set of dyad identifiers for a bilateral trade flow dataframe (coded in from, to, and amount traded format) so that I can use these identifiers for further statistical analysis.
My example data is provided below. From it, I have extracted the unique country dyads that involve the US.
# load the example data
trade_flow <- readRDS(gzcon(url("https://www.dropbox.com/s/ep7xldoq9go4f0g/trade_flow.rds?dl=1")))
# extract country dyads
country_dyad <- trade_flow[, c("from", "to")]
# identify unique pairs
up <- country_dyad[!duplicated(t(apply(country_dyad, 1, sort))),]
# extract only unique pairs that involve the US
up <- up[(up$from == "USA") | (up$to == "USA"), ]
## how can I use the unique pair object (up) to generate dyad identifiers and include them as a new column in the trade_flow dataframe
The next step is to match these unique dyad pairs against the from and to columns of the original dataframe (trade_flow) and add the resulting dyad identifiers as a new column (say, dyad) to trade_flow. The result should look something like the format below, in which each unique dyad is coded as a unique numerical value. I would be grateful if someone could help me with this.
from to trade_flow dyad
USA ITA 5100 2
USA UKG 4000 1
USA GMY 17000 3
USA ITA 4500 2
USA JPN 2900 4
USA UKG 6700 1
USA ROK 7000 5
USA UKG 2300 1
USA SAF 1500 6
IND USA 2400 7
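For reference, here is a minimal sketch of one way the dyad column could be built. It skips the matching against up and keys each row directly. It assumes that direction does not matter (USA-ITA and ITA-USA count as the same dyad, as the sort() call above implies) and that the country codes are plain uppercase strings. The trade_flow_demo data frame and its values are made up for illustration and stand in for the real trade_flow. The resulting numbers follow order of first appearance, so they will not necessarily match the illustrative numbering in the table above.

```r
# Hypothetical stand-in for trade_flow (values invented for illustration)
trade_flow_demo <- data.frame(
  from       = c("USA", "USA", "ITA", "IND", "USA", "GMY"),
  to         = c("ITA", "UKG", "USA", "USA", "UKG", "FRN"),
  trade_flow = c(5100, 4000, 4500, 2400, 6700, 1200),
  stringsAsFactors = FALSE
)

# Build an order-independent key: the alphabetically smaller code goes first
from_chr <- as.character(trade_flow_demo$from)
to_chr   <- as.character(trade_flow_demo$to)
key <- paste(pmin(from_chr, to_chr), pmax(from_chr, to_chr), sep = "_")

# Number each distinct key in order of first appearance
trade_flow_demo$dyad <- match(key, unique(key))

# Optional: keep identifiers only for dyads that involve the US
involves_us <- from_chr == "USA" | to_chr == "USA"
trade_flow_demo$dyad_us <- ifelse(involves_us, trade_flow_demo$dyad, NA_integer_)

print(trade_flow_demo)
```

If direction should matter instead, the key could simply be paste(from_chr, to_chr, sep = "_") without pmin() and pmax().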