I'm used to Python and JS, and pretty new to R, but enjoying it for data analysis. I was looking to create a new field in my data frame, based on some if/else logic, and tried to do it in a standard/procedural way:
    for (i in 1:nrow(df)) {
      if (is.na(df$First_Payment_date[i]) == TRUE) {
        df$User_status[i] = "User never paid"
      } else if (df$Payment_Date[i] >= df$First_Payment_date[i]) {
        df$User_status[i] = "Paying user"
      } else if (df$Payment_Date[i] < df$First_Payment_date[i]) {
        df$User_status[i] = "Attempt before first payment"
      } else {
        df$User_status[i] = "Error"
      }
    }
But it was CRAZY slow. I tried running this on a data frame of ~3 million rows, and it took way, way too long. Any tips on the "R" way of doing this?
Note that the df$Payment_Date and df$First_Payment_date fields are formatted as dates.
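In case it helps, here is a small made-up data frame with the same column names that reproduces the setup. The values are invented purely for illustration (my real data has ~3 million rows), and I'm assuming both date columns are of class Date.

```r
# Hypothetical sample data: the values are invented, only the column names
# match my real data frame. Both date columns are assumed to be class Date.
df <- data.frame(
  Payment_Date       = as.Date(c("2020-01-10", "2020-02-15", "2020-03-01")),
  First_Payment_date = as.Date(c("2020-01-05", "2020-03-01", NA))
)

# Pre-create the new column as character, filled with NA
df$User_status <- NA_character_

# Show the column types
str(df)
```

Running the loop above on this sample, I'd expect row 1 to come out as "Paying user", row 2 as "Attempt before first payment", and row 3 (no first payment date) as "User never paid".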