I hope one of you can help me - I have been trying loads of different ways of doing this and can't seem to find the right answer. I am fairly new to R, but have been writing a script to format some data that I have. Ultimately, I will want to run this script weekly as the data comes in.
I have a list of breed codes (1 - 80), many of which (but not all) have a corresponding 3-character country code (e.g. GBR or NLD). What I want to do is create a new column in my data frame that holds the country code corresponding to the breed code.
One of the problems I'm having is that not all of the numbers (1 - 80) have a corresponding country code. So I can't create a vector with them all in as they are not of the same type.
If there is no associated country code, I would like the country code to be the number of the breed code. For example, breed code 6 has no associated country, so I would want "6" to populate the relevant field in my new sire_country column.
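To make the rule above concrete, here is a minimal sketch of one way the mixed-type worry might be sidestepped: store everything as text, start with each breed code as its own label, and overwrite only the codes that have a country. Only a handful of codes are filled in here for illustration; this is an assumption about how the lookup could be built, not a tested solution for the real data.

```r
# Start with every breed code as its own label: "1", "2", ..., "80"
breed_country <- as.character(1:80)

# Overwrite only the codes that have a country
# (a few shown for illustration; the rest would follow the same pattern)
breed_country[c(1, 2, 3, 4, 5, 7)] <- "GBR"
breed_country[c(13, 62)] <- "NZL"
breed_country[63] <- "NLD"

# Codes with a country return it; code 6 falls back to "6"
print(breed_country[c(2, 6, 63)])
```

Because every element is a character string, the vector has a single type even though some entries look like numbers.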
In case it helps, I have added the script I have been trying to use, to no avail!
#denoting country codes for breed codes 1-80
breed_country<-c(
  "GBR", "GBR", "GBR", "GBR", "GBR", "6", "GBR", "8", "9", "10",
  "11", "GBR", "NZL", "GBR", "GBR", "16", "DNK", "18", "19", "GBR",
  "21", "GBR", "23", "24", "25", "26", "CHE", "28", "29", "30",
  "31", "32", "33", "34", "35", "36", "37", "38", "39", "40",
  "41", "42", "CZE", "44", "45", "IRL", "AUS", "POL", "DEU", "50",
  "51", "SWE", "DEU", "ESP", "55", "56", "57", "58", "SWE", "DEU",
  "DNK", "NZL", "NLD", "CAN", "USA", "66", "67", "68", "USA", "70",
  "FRA", "ITA", "FIN", "JEY", "GGY", "76", "NOR", "78", "79", "80")
breed_id<-c("Sire.Breed")
sire_country<-breed_country[breed_id]
sire_country[is.na("Sire.ID")]<-""
#the output looks like
sire_country
[1] NA
#when I add sire_country to my data frame, I get
sire_country
1 <NA>
2 <NA>
3 <NA>
4 <NA>
5 <NA>
6 <NA>
7 <NA>
8 <NA>
9 <NA>
10 <NA>
11 <NA>
12 <NA>
13 <NA>
14 <NA>
15 <NA>
# "Sire.Breed" is a column containing numerical breed codes in the data frame: df
# sire_country is what I want the new column with the country codes in to be called
# if there is no "Sire.ID" present, I want the field to remain blank - I have used this function elsewhere and it works fine
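For reference, a minimal sketch of what seems to be going wrong in the script above, and one possible fix. It assumes the data frame is called df and that Sire.Breed holds whole numbers between 1 and 80. The three-row df and the shortened lookup vector are made up for illustration.

```r
# Shortened, made-up stand-in for the 80-element breed_country vector
breed_country <- c("GBR", "GBR", "GBR", "GBR", "GBR", "6", "GBR", "8")

# Made-up data frame; the real one is read from the .csv file
df <- data.frame(Sire.ID = c("A1", NA, "C3"),
                 Sire.Breed = c(2, 6, 7),
                 stringsAsFactors = FALSE)

# Indexing with the string "Sire.Breed" looks for an element *named*
# "Sire.Breed"; the vector has no names, so the result is a single NA
bad <- breed_country["Sire.Breed"]

# Indexing with the column itself uses the breed codes as positions
df$sire_country <- breed_country[df$Sire.Breed]

# is.na("Sire.ID") tests the literal string, which is never NA;
# testing the column instead blanks out rows with no sire ID
df$sire_country[is.na(df$Sire.ID)] <- ""

print(df)
```

The key difference is df$Sire.Breed (the values in the column) versus "Sire.Breed" (just a piece of text).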
My data is read from a .csv file. Unfortunately I cannot post it, as it is confidential, but a fictional example would be:
animal  name   breed  Mother  Father  ID            Company  DOB
1       Alice  2      Vera    Tom     123456789012  Heinz    12/05/2017
2       Kate   63     Lucy    Jack    123456987147  Google   03/06/2017
(I can't format the table better, sorry)
Then I would want the country code that corresponds to the breed (2 or 63 in this case) to be added at the end, like so:
animal  name   breed  Mother  Father  ID            Company  DOB         Country
1       Alice  2      Vera    Tom     123456789012  Heinz    12/05/2017  GBR
2       Kate   63     Lucy    Jack    123456987147  Google   03/06/2017  NLD
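As a rough sketch of the result I am after, here is the fictional example rebuilt by hand with only the columns that matter. The third row (breed 6) is not part of the example above; it is added purely to show the fallback to the breed number. The named lookup vector known is a hypothetical alternative to the 80-element vector and contains only the codes needed here.

```r
# Fictional data; the third row is an extra, made-up case with no country
df <- data.frame(animal = 1:3,
                 name = c("Alice", "Kate", "Mary"),
                 breed = c(2, 63, 6),
                 stringsAsFactors = FALSE)

# Hypothetical named lookup holding only breeds that have a country
known <- c("2" = "GBR", "63" = "NLD")

# Look each breed up by name; codes with no entry come back as NA
country <- unname(known[as.character(df$breed)])

# Fall back to the breed code itself where no country is known
df$Country <- ifelse(is.na(country), as.character(df$breed), country)

print(df)
```

With a named lookup like this, only the breeds that actually have a country need to be listed.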
Apologies if I have used the wrong language throughout this, I'm still learning! Any help you can give me would be very much appreciated.
Thank you!