
This question is related to: Convert long state names embedded with other text to two-letter state abbreviations

The following for loop works well:

for(r in 1:nrow(states.list)) {
    states = sub(states.list[r,1], states.list[r,2], states)
}

states
[1] "Plano NJ"      "NC"            "xyz"           "AL 02138"      "TX"            "Town IA 99999"

Data:

states <- c("Plano New Jersey", "NC", "xyz", "Alabama 02138", "Texas", "Town Iowa 99999")

states.list = structure(list(state.name = structure(c(4L, 1L, 5L, 2L, 3L), .Label = c("Alabama", 
"Iowa", "Minnesota", "New Jersey", "Texas"), class = "factor"), 
    state.abb = structure(c(4L, 1L, 5L, 2L, 3L), .Label = c("AL", 
    "IA", "MN", "NJ", "TX"), class = "factor")), .Names = c("state.name", 
"state.abb"), class = "data.frame", row.names = c(NA, -5L))

states.list
  state.name state.abb
1 New Jersey        NJ
2    Alabama        AL
3      Texas        TX
4       Iowa        IA
5  Minnesota        MN

I tried the following to get a vectorized solution, but none of them work:

apply(states.list, 1, function(x) {
    sapply(states, function(y) {
        sub(x[1], x[2], y)
    })
})

sapply(states, function(x) sub(states.list[,1], states.list[,2], x))

apply(states.list, 1, function(x) sub(x[1],x[2], states))

How can I convert the loop to a vectorized solution (using apply etc., without any special packages)? Thanks for your help.

Edit: output of akrun's solution:

sapply(seq_len(nrow(states.list)), function(i) {
    sub(states.list[i,1], states.list[i,2], states[i])
})
[1] "Plano NJ"      "NC"            "xyz"           "Alabama 02138" "Texas"        
rnso

1 Answer


I doubt this can be vectorized, because each replacement has to be applied to the result of the previous one. At best you can hide the for loop behind an *apply equivalent, or use Reduce, as here:

ARGS <- split(states.list, seq_len(nrow(states.list)))
FUN  <- function(x, y) gsub(as.character(y$state.name),
                            as.character(y$state.abb), x)
Reduce(FUN, ARGS, states)

It's fancy and all but it is IMHO not worth the effort: it is probably not faster than a for loop and it is much harder to understand, isn't it? There's a little too much stigma around using for in R.
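
To make the comparison concrete, here is a minimal self-contained sketch that runs the plain loop and a Reduce fold side by side on the data from the question. The names loop.result and reduce.result are made up for illustration, the lookup table is rebuilt with character columns rather than factors, and fixed = TRUE is my own assumption (the state names are treated as literal strings, not regular expressions). It uses sub, as the question does, so only the first match in each string is replaced.

```r
states <- c("Plano New Jersey", "NC", "xyz", "Alabama 02138", "Texas", "Town Iowa 99999")

# Lookup table from the question, rebuilt with character columns (assumption:
# avoiding factors keeps the pattern and replacement as plain strings).
states.list <- data.frame(
  state.name = c("New Jersey", "Alabama", "Texas", "Iowa", "Minnesota"),
  state.abb  = c("NJ", "AL", "TX", "IA", "MN"),
  stringsAsFactors = FALSE
)

# Plain for loop: each pass rewrites the result of the previous pass.
loop.result <- states
for (r in seq_len(nrow(states.list))) {
  loop.result <- sub(states.list$state.name[r], states.list$state.abb[r],
                     loop.result, fixed = TRUE)
}

# The same fold written with Reduce: x is the accumulated vector,
# r is the row index of the next replacement to apply.
reduce.result <- Reduce(
  function(x, r) sub(states.list$state.name[r], states.list$state.abb[r],
                     x, fixed = TRUE),
  seq_len(nrow(states.list)),
  states
)

identical(loop.result, reduce.result)
```

Both versions walk through the rows one at a time and carry the partially converted vector forward, so Reduce here is only the loop in a different notation, not a vectorized replacement for it.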

flodel