I am trying to write a kind of 'if' statement in R that checks whether two string values in two different columns are the same. For example, if my Origin and Destination country are the same, I want a new column to contain Domestic. Where they differ, I would eventually code the remaining NA values as International.

I have tried several functions in R but still can't get it to work properly!

I think the recode function from the car library could fit. Here is some example data and the two lines of code I have tried. Thanks for the help.

#Data
Origin.Country <- c("Canada", "Vietnam", "Maldives", "Indonesia", "Spain", "Canada", "Vietnam")
Passengers <- c(100, 5000, 200, 10000, 200, 20, 4000)
Destination.Country <- c("France", "Vietnam", "Portugal", "Thailand", "Spain", "Canada", "Thailand")

data2 <- data.frame(Origin.Country, Destination.Country, Passengers)

#Creating new column
data2$Domestic<-NA 

#If Origin and Destination is the same = Domestic
data2$Domestic[data2$Origin.Country==data2$Destination.Country <- Domestic

data2$Domestic <- recode(data2$Origin.Country, c(data2$Destination.Country)='Domestic', else='International')

3 Answers

You can use ifelse:

data2$Domestic <- ifelse(as.character(data2$Origin.Country) == 
                         as.character(data2$Destination.Country), 
                         'Domestic', 'International')

I used as.character to coerce the country name variables to character for the comparison. ifelse takes a logical vector as its first argument and, element by element, returns the second argument where it is TRUE and the third argument where it is FALSE. In this instance, the comparison is performed row by row.
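To make the element-wise behaviour concrete, here is a minimal sketch on toy vectors (the names origin, destination and trip_type are made up for illustration, they are not from the question). It also shows that a missing value in either column gives NA rather than 'International':

```r
# Hypothetical toy vectors, not the asker's data
origin      <- c("Canada", "Vietnam", "Spain", NA)
destination <- c("France", "Vietnam", "Spain", "Canada")

# == compares the two vectors position by position and gives a logical vector
same <- origin == destination

# ifelse picks 'Domestic' where same is TRUE, 'International' where it is FALSE,
# and leaves NA where the comparison itself is NA
trip_type <- ifelse(same, "Domestic", "International")
print(trip_type)
```

If NA rows should count as International, they would need to be handled explicitly, for example by testing is.na(same) first.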

lmo

This might be a bit slow because it's not vectorised, but it worked based on your example:

data2$domestic <- apply(data2, 1, function(x) {
    ( x["Origin.Country"] == x["Destination.Country"] )
} )
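For comparison, a vectorised version of the same check might look like the sketch below. It assumes a small stand-in for data2 with factor columns (as data.frame() produced by default before R 4.0); the names row_wise and vectorised are only for illustration.

```r
# Small stand-in for the asker's data2; stringsAsFactors = TRUE mimics the
# pre-R-4.0 default, where string columns become factors
data2 <- data.frame(
  Origin.Country      = c("Canada", "Vietnam", "Spain"),
  Destination.Country = c("France", "Vietnam", "Spain"),
  Passengers          = c(100, 5000, 200),
  stringsAsFactors    = TRUE
)

# Row-by-row version: apply() converts the data frame to a character matrix
# and calls the function once per row
row_wise <- unname(apply(data2, 1, function(x) {
  x["Origin.Country"] == x["Destination.Country"]
}))

# Vectorised version: one comparison over whole columns; as.character() is
# needed because the two factors have different level sets
vectorised <- as.character(data2$Origin.Country) ==
  as.character(data2$Destination.Country)

print(identical(row_wise, vectorised))
```

Both give a logical vector with one entry per row; the vectorised form avoids the per-row function call, which is what tends to matter on a large dataset.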
Jasper
  • I have tried it on the example that I provided and it works (with TRUE and FALSE as results). However, I am dealing with a very large dataset, and there it does not work: no results appear. – Catherine Gladu Jun 01 '16 at 14:22
  • Do you get any errors? Do your columns have the same names as in your example? – Jasper Jun 01 '16 at 14:28

You can use recode in this way:

library(dplyr); library(car)
data2 %>% mutate(Domestic = recode(as.character(Origin.Country) == as.character(Destination.Country), 
                                   "TRUE='domestic'; else='international'"))

  Origin.Country Destination.Country Passengers      Domestic
1         Canada              France        100 international
2        Vietnam             Vietnam       5000      domestic
3       Maldives            Portugal        200 international
4      Indonesia            Thailand      10000 international
5          Spain               Spain        200      domestic
6         Canada              Canada         20      domestic
7        Vietnam            Thailand       4000 international
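If you would rather not load car, the same mapping can be sketched in base R with logical indexing, which is close to what the question was attempting. This is an alternative technique, not what recode does internally, and the helper name same is made up for illustration:

```r
# Same data as in the question; stringsAsFactors = FALSE keeps the country
# columns as character so they can be compared directly
data2 <- data.frame(
  Origin.Country      = c("Canada", "Vietnam", "Maldives", "Indonesia",
                          "Spain", "Canada", "Vietnam"),
  Destination.Country = c("France", "Vietnam", "Portugal", "Thailand",
                          "Spain", "Canada", "Thailand"),
  Passengers          = c(100, 5000, 200, 10000, 200, 20, 4000),
  stringsAsFactors    = FALSE
)

# TRUE where origin and destination match
same <- data2$Origin.Country == data2$Destination.Country

# Start with the default label, then overwrite the matching rows;
# which() drops any NA comparisons so they keep the default
data2$Domestic <- "international"
data2$Domestic[which(same)] <- "domestic"

print(data2)
```

Note that the label on the right-hand side has to be a quoted string; an unquoted Domestic would be looked up as a variable name.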
Psidom