I have a data frame in R called `QCEW_County_Denominated` with a column called `Industry`. Whenever the value of this column is `[31-33]`, `[44-45]`, or `[48-49]` (these are the literal values, not ranges), I would like to change it to 31, 44, or 48, respectively. Any advice on how to write this? If-then statements in R are my weakest point, so I figured I'd ask here.
3 Answers
Check out `case_when()` from dplyr:
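library(dplyr)
# Toy data with the same bracketed values as in the question
x <- data.frame(industry = rep(c("[31-33]", "[44-45]", "[48-49]"), each = 4))
# Add a numeric column; rows that match no condition get NA
x %>%
  mutate(industry_n = case_when(.$industry == "[31-33]" ~ 31,
                                .$industry == "[44-45]" ~ 44,
                                .$industry == "[48-49]" ~ 48))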
or, if you have the dev version of dplyr (`devtools::install_github("hadley/dplyr")`), you can run:
x %>%
  mutate(industry_n = case_when(industry == "[31-33]" ~ 31,
                                industry == "[44-45]" ~ 44,
                                industry == "[48-49]" ~ 48))
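
One thing to watch with either version: rows that match none of the conditions come out as `NA`. Below is a minimal sketch of overwriting the column in place while keeping every other value. It assumes a dplyr version where `case_when()` accepts bare column names inside `mutate()`, and `qcew` is a made-up stand-in for `QCEW_County_Denominated` (the `"10"` and `"52"` rows are invented for illustration).

```r
library(dplyr)

# Hypothetical stand-in for QCEW_County_Denominated
qcew <- data.frame(
  Industry = c("10", "[31-33]", "[44-45]", "[48-49]", "52"),
  stringsAsFactors = FALSE
)

qcew <- qcew %>%
  mutate(Industry = case_when(
    Industry == "[31-33]" ~ "31",
    Industry == "[44-45]" ~ "44",
    Industry == "[48-49]" ~ "48",
    TRUE ~ Industry  # fallback: leave every other value unchanged
  ))

# The column is still character here; convert it if you need numbers
qcew$Industry_num <- as.numeric(qcew$Industry)
```

The replacement values are written as strings because all right-hand sides of `case_when()` need to share a type with the fallback column.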

Lucy
-
Hi Lucy, do you mind explaining what the 2nd and 3rd lines of the code are doing? – J.Jack Mar 06 '17 at 17:07
-
In the second line I am creating a toy dataset similar to the one you described (it has a variable called `industry` with the values `[31-33]`, `[44-45]`, and `[48-49]`). The third line takes that dataset and "pipes" it into the `mutate()` function, which creates a new column called `industry_n`. Here is a little more on "pipes": http://r4ds.had.co.nz/pipes.html – Lucy Mar 06 '17 at 19:36
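
To make the pipe part concrete, here is a small sketch showing that `x %>% mutate(...)` is just another way of writing `mutate(x, ...)`. The column name `n_chars` is made up for illustration and has nothing to do with the original question.

```r
library(dplyr)

# Toy data, as in the answer above (kept as character on purpose)
x <- data.frame(industry = c("[31-33]", "[44-45]", "[48-49]"),
                stringsAsFactors = FALSE)

# Piped form: x is passed as the first argument of mutate()
piped <- x %>% mutate(n_chars = nchar(industry))

# Equivalent plain function call
nested <- mutate(x, n_chars = nchar(industry))

# Both forms should give the same data frame
stopifnot(identical(piped, nested))
```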
Or just like this, in base R:
df <- data.frame(Industry = rep(c("[31-33]", "[44-45]", "[48-49]"), each = 4), stringsAsFactors = FALSE)
# Overwrite the matching rows; the column stays character ("31", not 31)
df$Industry[df$Industry == "[31-33]"] <- 31
df$Industry[df$Industry == "[44-45]"] <- 44
df$Industry[df$Industry == "[48-49]"] <- 48
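
If there are more than a handful of codes, one possible variation is a named lookup vector instead of one assignment per value. This is only a sketch: the names `lookup` and `hit` and the extra `"10"` and `"52"` rows are made up for illustration.

```r
# Toy data with two values that should be left alone
df <- data.frame(Industry = c("10", "[31-33]", "[44-45]", "[48-49]", "52"),
                 stringsAsFactors = FALSE)

# Old value (name) -> new value (element)
lookup <- c("[31-33]" = "31", "[44-45]" = "44", "[48-49]" = "48")

# Only touch rows whose value appears in the lookup table
hit <- df$Industry %in% names(lookup)
df$Industry[hit] <- unname(lookup[df$Industry[hit]])

# Optional: convert to numeric once every value looks like a number
df$Industry <- as.numeric(df$Industry)
```

Adding another code then only means adding one entry to `lookup`.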

user3640617
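Lucy's code is ideal.
However, if for some reason you're not going to use dplyr (though I don't see a reason why you shouldn't), you can use nested `ifelse()` calls:
# Assumes x$industry is character; the final argument keeps unmatched values as they are
x$new <- ifelse(x$industry == "[31-33]", 31,
         ifelse(x$industry == "[44-45]", 44,
         ifelse(x$industry == "[48-49]", 48, x$industry)))
and so on, with one more nested `ifelse()` for each further value you need to recode.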