I'm working with GTAP data and would like to combine it with other datasets. I am trying to find a way to use country IDs/codes, and one option seems to be the R package countrycode. However, GTAP is not included in the package's supported codelist. I tried to create a custom dictionary, but without success.

Example gtap data:

gtap <- structure(list(COMM = c("coa", "coa", "coa", "coa", "coa", "coa"
), Source = c("afg", "afg", "afg", "afg", "afg", "afg"), Destination = c("afg", 
"alb", "are", "arg", "arm", "aus"), TotValue = c(9.99999997475243e-07, 
7.83022114774212e-05, 0.00216353917494416, 0.000611430441495031, 
2.76709855029367e-08, 2.72226079687243e-05)), row.names = c(NA, 
6L), class = "data.frame")

This is what I've tried:

library(countrycode)
library(tidyverse)

get_dictionary()

cd <- get_dictionary("gtap10")

gtap_iso3c <- gtap %>% 
  mutate(countrycode(Source, "gtap.cha", "iso3c"))
Error in `mutate()`:
ℹ In argument: `countrycode(Source, "gtap.cha", "iso3c")`.
Caused by error in `countrycode()`:
! The `origin` argument must be a string of length 1 equal to one of these values: cctld, country.name, country.name.de, country.name.fr, country.name.it, cowc, cown, dhs, ecb, eurostat, fao, fips, gaul, genc2c, genc3c, genc3n, gwc, gwn, imf, ioc, iso2c, iso3c, iso3n, p5c, p5n, p4c, p4n, un, un_m49, unicode.symbol, unhcr, unpd, vdem, wb, wb_api2c, wb_api3c, wvs, country.name.en.regex, country.name.de.regex, country.name.fr.regex, country.name.it.regex.
Run `rlang::last_trace()` to see where the error occurred.
MoonS
  • What is the variable called `Source` in your code? – Quinten Aug 11 '23 at 16:42
  • What are you trying to get from this? IIUC, `countrycode::countrycode()` returns a `character` vector, so your `mutate` call should be assigning it to a variable (new or existing). – r2evans Aug 11 '23 at 16:48
  • FYI, `cd` does not contain `"afg"` in any of its rows; the closest is where `gtap.cha = "XSA"`, which corresponds to `"Afghanistan"`, but I don't think `"afg"` itself would map perfectly, only by substring. – r2evans Aug 11 '23 at 16:54
  • You'd utilize a dictionary like `gtap10` by including `custom_dict = cd` in `countrycode()`. However, this would be done to convert between fields in *that* dictionary, not between `gtap10` and fields in `codelist`. As @r2evans points out, your sample data appears to be something other than `gtap.cha`, possibly `iso3c` already. – Seth Aug 11 '23 at 17:00
  • Thanks all! I might have to review how to use `countrycode` again, or see if there is a better way to build a database that contains different country codes. – MoonS Aug 11 '23 at 21:14

1 Answer

First, to use a custom dictionary with countrycode(), pass the argument custom_dict = cd, where cd is a data frame containing the matching codes/names. The origin and destination arguments must then name columns of that data frame.
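As a minimal sketch of that usage, the snippet below builds a small dictionary by hand and passes it via custom_dict. The data frame gtap_dict and its column name gtap.code are hypothetical, and the mapping assumes the lowercase codes in the sample data line up with ISO 3166-1 alpha-3 codes, which should be verified against the GTAP documentation.

```r
library(countrycode)

# Hypothetical hand-built dictionary: one row per code, with no
# duplicates in the column used as the origin.
# ASSUMPTION: the lowercase sample codes correspond to ISO3 codes.
gtap_dict <- data.frame(
  gtap.code = c("afg", "alb", "are", "arg", "arm", "aus"),
  iso3c     = c("AFG", "ALB", "ARE", "ARG", "ARM", "AUS"),
  stringsAsFactors = FALSE
)

codes <- c("afg", "alb", "are", "arg", "arm", "aus")

# With custom_dict, origin and destination refer to columns of gtap_dict
iso <- countrycode(
  codes,
  origin = "gtap.code",
  destination = "iso3c",
  custom_dict = gtap_dict
)
```

Inside mutate(), the result would need to be assigned to a column, e.g. mutate(Destination_iso3c = countrycode(Destination, "gtap.code", "iso3c", custom_dict = gtap_dict)).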

However, the "gtap10" custom dictionary you are using is not suitable for converting "gtap.cha" to "iso3c", for two reasons: 1. it does not contain iso3c codes, and 2. its "gtap.cha" column contains numerous duplicate values, so it cannot be used as an "origin". For example, if you were going from gtap.cha -> country.name, "AUS" would result in multiple matches: Australia, Christmas Island, Cocos (Keeling) Islands, etc.

dplyr::tibble(countrycode::get_dictionary("gtap10"))
#> # A tibble: 244 × 5
#>    country.name             country.name.en.regex    gtap.name gtap.num gtap.cha
#>    <chr>                    <chr>                    <chr>        <int> <chr>   
#>  1 Australia                "australia"              Australia        1 AUS     
#>  2 Christmas Island         "christmas"              Australia        1 AUS     
#>  3 Cocos (Keeling) Islands  "\\bcocos|keeling"       Australia        1 AUS     
#>  4 Heard & McDonald Islands "heard.*mcdonald"        Australia        1 AUS     
#>  5 Norfolk Island           "norfolk"                Australia        1 AUS     
#>  6 New Zealand              "new.?zealand"           New Zeal…        2 NZL     
#>  7 American Samoa           "^(?=.*americ).*samoa"   Rest of …        3 XOC     
#>  8 Cook Islands             "\\bcook"                Rest of …        3 XOC     
#>  9 Fiji                     "fiji"                   Rest of …        3 XOC     
#> 10 French Polynesia         "french.?polynesia|tahi… Rest of …        3 XOC     
#> # ℹ 234 more rows
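The duplicate problem can be checked directly. The sketch below uses a small hand-copied subset of the rows printed above (the data frame d is illustrative, not the full dictionary) to show that the code column repeats while the name column does not, which is why the dictionary only works with country names as the origin.

```r
# Illustrative subset copied from the printed gtap10 rows above
d <- data.frame(
  country.name = c("Australia", "Christmas Island",
                   "Norfolk Island", "New Zealand"),
  gtap.cha     = c("AUS", "AUS", "AUS", "NZL"),
  stringsAsFactors = FALSE
)

# Codes that appear more than once cannot identify a single row
dup_codes <- unique(d$gtap.cha[duplicated(d$gtap.cha)])

# Country names sharing each duplicated code
names_by_code <- split(d$country.name, d$gtap.cha)[dup_codes]

# The name column has no repeats, so it is usable as an origin
name_is_unique <- anyDuplicated(d$country.name) == 0
```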
CJ Yetman