R - Why can't I match country codes to custom dictionary?

Question

I'm working with GTAP data and would like to combine it with other datasets. I am trying to find a way to use country IDs/codes, and one option seems to be the R package countrycode. However, GTAP is not included in the package's supported codelist. I tried to create a custom dictionary, but without success.

Example gtap data:

gtap <- structure(list(COMM = c("coa", "coa", "coa", "coa", "coa", "coa"
), Source = c("afg", "afg", "afg", "afg", "afg", "afg"), Destination = c("afg", 
"alb", "are", "arg", "arm", "aus"), TotValue = c(9.99999997475243e-07, 
7.83022114774212e-05, 0.00216353917494416, 0.000611430441495031, 
2.76709855029367e-08, 2.72226079687243e-05)), row.names = c(NA, 
6L), class = "data.frame")

This is what I've tried:

library(countrycode)
library(tidyverse)

get_dictionary()

cd <- get_dictionary("gtap10")

gtap_iso3c <- gtap %>% 
  mutate(countrycode(Source, "gtap.cha", "iso3c"))

Error in `mutate()`:
ℹ In argument: `countrycode(Source, "gtap.cha", "iso3c")`.
Caused by error in `countrycode()`:
! The `origin` argument must be a string of length 1 equal to one of these values: cctld, country.name, country.name.de, country.name.fr, country.name.it, cowc, cown, dhs, ecb, eurostat, fao, fips, gaul, genc2c, genc3c, genc3n, gwc, gwn, imf, ioc, iso2c, iso3c, iso3n, p5c, p5n, p4c, p4n, un, un_m49, unicode.symbol, unhcr, unpd, vdem, wb, wb_api2c, wb_api3c, wvs, country.name.en.regex, country.name.de.regex, country.name.fr.regex, country.name.it.regex.
Run `rlang::last_trace()` to see where the error occurred.
>

What are you trying to get from this? IIUC, `countrycode::countrycode()` returns a `character` vector, so your `mutate` call should be assigning it to a variable (new or existing). — r2evans, Aug 11 '23 at 16:48
FYI, `cd` does not contain `"afg"` in any of its rows; the closest is where `gtap.cha = "XSA"` which corresponds with `"Afghanistan"`, but I don't think `"afg"` itself would map perfectly, only by substring. — r2evans, Aug 11 '23 at 16:54
You'd utilize a dictionary like `gtap10` by including `custom_dict = cd` in `countrycode()`. However, this would be done to convert between fields in *that* dictionary, not between `gtap10` and fields in `codelist`. As @r2evans points out, your sample data appears to be something other than `gtap.cha`, possibly `iso3c` already. — Seth, Aug 11 '23 at 17:00
Thanks all! I might have to review how to use ```countrycode``` again, or if there is a better way to build a database that contains different country codes. — MoonS, Aug 11 '23 at 21:14

score 0 · Answer 1 · answered Aug 16 '23 at 14:12

First of all, to use a custom dictionary with `countrycode()`, you must pass the argument `custom_dict = cd`, where `cd` is a data frame containing the matching codes/names. The `origin` and `destination` arguments then have to name columns of that data frame.
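As a minimal sketch of that call, the example below uses a small hand-made data frame instead of the downloaded `gtap10` dictionary. The name `toy_dict` and its two rows are hypothetical and only illustrate the shape `custom_dict` expects. The origin column is kept free of duplicates here.

```r
library(countrycode)

# Hypothetical two-row dictionary, for illustration only.
# Column names mimic those in "gtap10", but the content is made up here.
toy_dict <- data.frame(
  gtap.cha     = c("AUS", "NZL"),
  country.name = c("Australia", "New Zealand"),
  stringsAsFactors = FALSE
)

# origin and destination must both be column names of custom_dict
converted <- countrycode(
  c("AUS", "NZL", "AUS"),
  origin      = "gtap.cha",
  destination = "country.name",
  custom_dict = toy_dict
)
converted
```

Because both columns come from `toy_dict`, the conversion stays inside the custom dictionary and does not touch the built-in `codelist`.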

However, the `gtap10` custom dictionary you are using is not suitable for matching `gtap.cha` to `iso3c`, for two reasons. First, it does not contain `iso3c` codes. Second, its `gtap.cha` column contains numerous duplicate values, so it cannot be used as an `origin`: if you were going from `gtap.cha` to `country.name`, `"AUS"` would result in multiple matches (Australia, Christmas Island, Cocos (Keeling) Islands, etc.).

dplyr::tibble(countrycode::get_dictionary("gtap10"))
#> # A tibble: 244 × 5
#>    country.name             country.name.en.regex    gtap.name gtap.num gtap.cha
#>    <chr>                    <chr>                    <chr>        <int> <chr>   
#>  1 Australia                "australia"              Australia        1 AUS     
#>  2 Christmas Island         "christmas"              Australia        1 AUS     
#>  3 Cocos (Keeling) Islands  "\\bcocos|keeling"       Australia        1 AUS     
#>  4 Heard & McDonald Islands "heard.*mcdonald"        Australia        1 AUS     
#>  5 Norfolk Island           "norfolk"                Australia        1 AUS     
#>  6 New Zealand              "new.?zealand"           New Zeal…        2 NZL     
#>  7 American Samoa           "^(?=.*americ).*samoa"   Rest of …        3 XOC     
#>  8 Cook Islands             "\\bcook"                Rest of …        3 XOC     
#>  9 Fiji                     "fiji"                   Rest of …        3 XOC     
#> 10 French Polynesia         "french.?polynesia|tahi… Rest of …        3 XOC     
#> # ℹ 234 more rows
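
If, as the comments suggest, the codes in your sample data are simply lowercase ISO 3166-1 alpha-3 codes, no custom dictionary may be needed at all. The sketch below rests on that assumption, which the sample does not confirm for the full dataset. It uppercases the codes, stores the result in named columns, and looks up country names in the built-in codelist. The new column names are hypothetical. Any GTAP aggregate region code that is not a valid ISO code would come back as `NA` with a warning.

```r
library(countrycode)

# Subset of the sample data from the question
gtap <- data.frame(
  Source      = rep("afg", 6),
  Destination = c("afg", "alb", "are", "arg", "arm", "aus"),
  stringsAsFactors = FALSE
)

# Assumption: the codes are ISO3 in lowercase, so uppercase them first
gtap$Source_iso3c      <- toupper(gtap$Source)
gtap$Destination_iso3c <- toupper(gtap$Destination)

# Assign the result to a named column; origin/destination are built-in codes
gtap$Destination_name <- countrycode(
  gtap$Destination_iso3c,
  origin      = "iso3c",
  destination = "country.name"
)
gtap
```

With an `iso3c` column in place, the data can be joined to other datasets on that column, or converted to any other code the package supports.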
