0

I would like to convert a list of country codes into numerical values to use in regression analysis. For examples all country codes that say "US" get 1, "DE" get 2, "FR" get 3 etc. without using 'distinct'.

I tried the cast function:

cast(countrycode as numeric(1))

But it didn't work and I don't want to do it in Excel with the IF-function for 7000 values.

I already have a list of country codes. I just want to join numeric values against it so when I retrieve the data, it shows like:

US | 1
FR | 2
DE | 3
US | 1
NL | 4
FR | 2

For 144 (distinct) country codes..

Naj
  • 37
  • 2
  • 10
  • 2
    Create a lookup table with the mapping from the code to the number then join your table to that lookup table –  Dec 10 '15 at 12:53
  • corax228 first suggestion is good enough if you don't have more than 10 cases of different codes otherwise sql is unreadable . if you have more than that, you might write a small. function to simplify your sql or you could make a table with countrycodes and join against it. – LongBeard_Boldy Dec 10 '15 at 12:56
  • I already have a list of country codes. I just want to join numeric values against it so when I retrieve the data, it shows like: US | 1 FR | 2 DE | 3 US | 1 NL | 4 FR | 2 For 144 distinct country codes. – Naj Dec 10 '15 at 15:18

1 Answers1

0

If you know for certain that the countries are in order, or you just want to assign arbitrary integer values, you can create an enum type:

create type country as enum  ('US', 'DE', 'FR');

When you insert the strings, you can cast them to the type values: select 'US'::country; "US"

Internally, these are treated as sequential numbers:

select enumlabel, enumsortorder
from pg_enum e join pg_type t on e.enumtypid = t.oid
where typname = 'country'

If you have a query you're doing the regression on, you could join it to this to do the lookup.

garysieling
  • 346
  • 2
  • 7