
I have a data frame that contains state names, and I would like to create a new variable called "region" whose value is assigned based on the state found in the "State" variable.

For example, if State is "Alabama" or "Georgia", I would like region to be "South". If State is "Washington" or "California", I would like it to be "West". I have to do this for each of the 48 contiguous U.S. states, and I'm having difficulty figuring out the best way to do it. Any help with this (I'm sure simple) procedure would be great. To make this clearer, the current data frame has only the following information:

State
Michigan
Wyoming
California
Georgia
Alabama

I need to have code that adds a region variable to the data frame and then assigns a region name based on the state. I tried the following code, but keep getting an error message:

preplogdat$region[preplogdat$State==c("Washington","Wyoming","California","Idaho")] <- "West"

I ultimately need code that assigns these region labels so that the final product looks like the following:

State      Region
Michigan   Midwest
Wyoming    West
California West
Georgia    South
Alabama    South
Justin
  • Use `%in%` rather than `==` – MrFlick Sep 02 '22 at 14:28
  • It would probably be even easier if you had a lookup data.frame with one column for state and one for region. Then you could just merge/join that to your data.frame rather than doing a bunch of individual assignments. – MrFlick Sep 02 '22 at 14:31
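
The two suggestions in the comments could look roughly like the sketch below. It assumes the data frame is called `preplogdat` with a character column `State`, as in the question. The name `region_lookup` is made up for illustration, and the lookup table only covers the five states shown in the question, so it would need one row for each remaining state.

```r
# Example data taken from the question
preplogdat <- data.frame(
  State = c("Michigan", "Wyoming", "California", "Georgia", "Alabama"),
  stringsAsFactors = FALSE
)

# Approach 1: %in% tests each State for membership in the set.
# By contrast, == compares element by element and recycles the shorter vector,
# which is not what is wanted here.
preplogdat$Region <- NA_character_
west_states <- c("Washington", "Wyoming", "California", "Idaho")
preplogdat$Region[preplogdat$State %in% west_states] <- "West"

# Approach 2: a lookup table with one row per state.
# region_lookup is a hypothetical name; only five states are filled in here.
region_lookup <- data.frame(
  State  = c("Michigan", "Wyoming", "California", "Georgia", "Alabama"),
  Region = c("Midwest", "West", "West", "South", "South"),
  stringsAsFactors = FALSE
)

# match() returns, for each State, its row position in the lookup table,
# so the original row order of preplogdat is preserved.
preplogdat$Region <- region_lookup$Region[match(preplogdat$State, region_lookup$State)]

print(preplogdat)
```

`merge(preplogdat, region_lookup, by = "State")` should give the same pairing, but by default it sorts the result by the key column, which is why `match()` is used here. States missing from the lookup table end up as `NA`, which makes gaps easy to spot.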
