-1

I have a data frame in R called QCEW_County_Denominated, with a column called Industry. Whenever the value of this column is [31-33], [44-45], or [48-49] (these are the literal values, not ranges), I would like to change it to 31, 44, or 48, respectively. Any advice on how to write this? If-then statements in R are my weakest point, so I figured I'd ask here.

J.Jack

3 Answers

0

Check out case_when() from dplyr:

library('dplyr')
x <- data.frame(industry = rep(c("[31-33]","[44-45]","[48-49]"), each = 4))
x %>% 
 mutate(industry_n = case_when(.$industry == "[31-33]" ~ 31, 
                               .$industry == "[44-45]" ~ 44, 
                               .$industry == "[48-49]" ~ 48))

Or, if you have the dev version of dplyr (devtools::install_github("hadley/dplyr")), you can refer to the column by name without the `.$` prefix:

x %>% 
 mutate(industry_n = case_when(industry == "[31-33]" ~ 31, 
                               industry == "[44-45]" ~ 44, 
                               industry == "[48-49]" ~ 48))
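
One thing to watch for: case_when() returns NA for any row that matches none of the conditions, which matters if the real Industry column also holds ordinary codes. Below is a minimal sketch of a fallback branch, using the bare-column form from the second snippet. The toy data and the extra code "23" are made up for illustration, and the fallback assumes every other value is a plain number stored as text.

```r
library(dplyr)

# Hypothetical toy data: the three bracketed codes plus one ordinary code.
x <- data.frame(industry = c("[31-33]", "[44-45]", "[48-49]", "23"),
                stringsAsFactors = FALSE)

x <- x %>%
  mutate(industry_n = case_when(
    industry == "[31-33]" ~ 31,
    industry == "[44-45]" ~ 44,
    industry == "[48-49]" ~ 48,
    # Fallback for every other row. as.numeric() is evaluated on the whole
    # column, so the warnings about the bracketed strings are suppressed.
    TRUE ~ suppressWarnings(as.numeric(industry))
  ))
```

All right-hand sides have to be the same type (numeric here), which is why the fallback converts with as.numeric() instead of returning the original strings.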
Lucy
  • Hi Lucy, do you mind explaining what the 2nd and 3rd lines of the code are doing? – J.Jack Mar 06 '17 at 17:07
  • In the second line I am creating a toy dataset similar to the one you described in your example (it has a variable called `industry` with the values `[31-33]`, `[44-45]`, and `[48-49]`). The third line takes that dataset and "pipes" it into the `mutate()` function, which creates a new column called `industry_n`. Here is a little more on "pipes": http://r4ds.had.co.nz/pipes.html – Lucy Mar 06 '17 at 19:36
0

Or just like this:

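# Toy data; keep the column as character rather than factor so values can be replaced freely
df <- data.frame(Industry = rep(c("[31-33]","[44-45]","[48-49]"), each = 4), stringsAsFactors = FALSE)
df$Industry[df$Industry=="[31-33]"] <- 31
df$Industry[df$Industry=="[44-45]"] <- 44
df$Industry[df$Industry=="[48-49]"] <- 48
# The column is still character ("31", "44", "48"); convert it if you need numbers
df$Industry <- as.numeric(df$Industry)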
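
If the list of values to replace grows, the same idea can be written once with a named lookup vector instead of one assignment per value. This is only a sketch: the name `recode_map` and the extra code "23" are made up for illustration, and the final conversion assumes every remaining value is a number stored as text.

```r
# Hypothetical toy data, including one code ("23") that should stay unchanged.
df <- data.frame(Industry = c("[31-33]", "[44-45]", "[48-49]", "23"),
                 stringsAsFactors = FALSE)

# Named vector: names are the old values, elements are the replacements.
recode_map <- c("[31-33]" = "31", "[44-45]" = "44", "[48-49]" = "48")

# Rows whose current value appears in the lookup table.
hit <- df$Industry %in% names(recode_map)

# Replace only those rows by looking each value up by name.
df$Industry[hit] <- unname(recode_map[df$Industry[hit]])

# Convert the whole column to numeric (assumes the rest are numeric strings).
df$Industry <- as.numeric(df$Industry)
```

Adding another replacement then only means adding one entry to `recode_map`.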
user3640617
0

Lucy's code is ideal.

However, if for some reason you're not going to use dplyr (though I don't see why you shouldn't), you can use nested ifelse() calls from base R:

x$new <- ifelse(x$industry == "[31-33]", 31,
         ifelse(x$industry == "[44-45]", 44,
         ifelse(x$industry == "[48-49]", 48, x$industry)))

and so on for any further values.
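
If every bracketed value follows the same "[start-end]" pattern and you always want the start, a regular expression avoids the nesting altogether. This is a different technique from ifelse(), sketched under that assumption; the toy data and the extra code "23" are made up for illustration.

```r
# Hypothetical toy data: three bracketed codes plus one ordinary code.
x <- data.frame(industry = c("[31-33]", "[44-45]", "[48-49]", "23"),
                stringsAsFactors = FALSE)

# Match "[digits-digits]" and keep only the first group of digits.
# Values that do not match the pattern are returned unchanged by sub().
first_code <- sub("^\\[([0-9]+)-[0-9]+\\]$", "\\1", x$industry)

# Convert to numeric (assumes every remaining value is a numeric string).
x$new <- as.numeric(first_code)
```

Unlike the nested ifelse(), this would also rewrite any other bracketed range in the column, so it only fits if that is what you want.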

Laurent