I'm new to programming and don't quite understand the logic behind functions in R yet. I want to write a function that takes four variables, where conditions on three of them determine the value of the fourth, and that runs through every case in the data set. For simplicity, let's say my data frame has four variables (var1 to var4) with 100 cases each:

f1 <- function(w, x, y, z) {
  for (n in seq_along(w)) {
    if (!is.na(w[n]) & !is.na(x[n]) & !is.na(y[n])) {
      z[n] <- 0
    } else if (!is.na(w[n]) & !is.na(x[n]) & is.na(y[n])) {
      z[n] <- 1
    } else if (!is.na(w[n]) & is.na(x[n]) & is.na(y[n])) {
      z[n] <- 2
    } else if (!is.na(w[n]) & is.na(x[n]) & !is.na(y[n])) {
      z[n] <- 3
    }
  }
}

f1(df$var1, df$var2, df$var3, df$var4)

Why doesn't the function work?

knoxgon
Rico

1 Answer

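The variable z that you modify inside f1 is a local copy, so the changes never reach var4 in the original data frame df. On top of that, the last expression evaluated in your function is the for loop, which returns NULL invisibly, so the call gives you nothing to work with. You need to return the modified z and assign it to var4 in df.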
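To see the copy behaviour in isolation, here is a minimal sketch. The helper set_first and the vector x are made up for illustration and are not part of your code.

```r
# Hypothetical helper: changes its argument, then returns the changed copy
set_first <- function(v) {
  v[1] <- 99  # modifies the local copy only
  v           # return the modified copy to the caller
}

x <- c(1, 2, 3)
set_first(x)        # result is discarded, x itself is untouched
x_after_call <- x   # snapshot of x after the call above

x <- set_first(x)   # assigning the result back is what updates x
```

After the first call x is unchanged; only the final assignment updates it. The same pattern applies to f1: it has to end by returning z.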

f1 <- function(w, x, y, z) {
  for (n in seq_along(w)) {
    if (!is.na(w[n]) & !is.na(x[n]) & !is.na(y[n])){
      z[n]<-0
    } else if (!is.na(w[n]) & !is.na(x[n]) & is.na(y[n])) {
      z[n]<-1
    } else if (!is.na(w[n]) & is.na(x[n]) & is.na(y[n])) {
      z[n]<-2
    } else if (!is.na(w[n]) & is.na(x[n]) & !is.na(y[n])) {
      z[n]<-3
    } 
  } 
  z  # <--- return value 
}

Then call f1 as:

df$var4 <- f1(df$var1, df$var2, df$var3, df$var4)
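As a side note, the loop is not strictly needed, because is.na() and & already work on whole vectors. Below is a rough sketch of a loop-free variant, assuming the same four rules as above. The name f1_vec and the small example data frame are made up for illustration.

```r
# Hypothetical vectorized variant of f1: same rules, no explicit loop
f1_vec <- function(w, x, y, z) {
  has_w <- !is.na(w)
  has_x <- !is.na(x)
  has_y <- !is.na(y)
  z[has_w &  has_x &  has_y] <- 0
  z[has_w &  has_x & !has_y] <- 1
  z[has_w & !has_x & !has_y] <- 2
  z[has_w & !has_x &  has_y] <- 3
  z  # cases where w is NA keep their original z value, as in f1
}

# Small made-up data frame, one row per rule plus one row with var1 missing
df <- data.frame(
  var1 = c(1, 1, 1, 1, NA),
  var2 = c(1, 1, NA, NA, 1),
  var3 = c(1, NA, NA, 1, 1),
  var4 = NA_real_
)

df$var4 <- f1_vec(df$var1, df$var2, df$var3, df$var4)
```

As with f1, the result still has to be assigned back to df$var4.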
B.Shankar