I have census data on race in state populations since 1990. I want to do two things at the year/state level in RStudio: 1. aggregate everyone who is Hispanic/Latino, of any racial group, into an entirely new racial group, "Hispanic/Latino"; 2. compute each racial group's percentage of the total population. For example, I want to know the proportion of non-Hispanic Blacks in Alabama in 1990. The image shows what my data looks like.
1 Answer
I'm not 100% clear on what you need the end result for #1 to be, but if what you ultimately need is for the "Race" column to indicate "Hispanic or Latino", you could do:
Data$Race[(Data$Ethnicity == "Hispanic or Latino")] <- "Hispanic or Latino"
You could also combine what is in the Ethnicity and Race columns like this:
Data$Race[Data$Ethnicity == "Hispanic or Latino"] <- paste(Data$Race[Data$Ethnicity == "Hispanic or Latino"], Data$Ethnicity[Data$Ethnicity == "Hispanic or Latino"])
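If the goal for #1 is a single "Hispanic or Latino" group with its own population count, the recode can be followed by a sum within each year, state, and race. This is only a minimal sketch: the column names (Year, State, Race, Ethnicity, Population), the "Not Hispanic or Latino" label, and the toy numbers are assumptions, since the real data is only shown in the image. The as.character() call guards against Race being stored as a factor, in which case assigning a new level directly would produce NA.

```r
library(dplyr)

# Hypothetical toy data; column names and values are assumptions
census_toy <- data.frame(
  Year = c(1990, 1990, 1990, 1990),
  State = c("AL", "AL", "AL", "AL"),
  Race = c("Black", "Black", "White", "White"),
  Ethnicity = c("Hispanic or Latino", "Not Hispanic or Latino",
                "Hispanic or Latino", "Not Hispanic or Latino"),
  Population = c(10, 90, 30, 170),
  stringsAsFactors = FALSE
)

recoded <- census_toy %>%
  # Anyone flagged Hispanic or Latino is moved into one new group,
  # regardless of the race originally recorded
  mutate(Race = if_else(Ethnicity == "Hispanic or Latino",
                        "Hispanic or Latino",
                        as.character(Race))) %>%
  # Collapse rows that now share the same Year, State, and Race
  group_by(Year, State, Race) %>%
  summarise(Population = sum(Population), .groups = "drop")

print(recoded)
```

After this step the remaining "Black" and "White" rows contain only non-Hispanic residents, which is what the example in the question asks about.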
For #2...
# Load library
library(dplyr)
# Make test data
Data <- data.frame(Year = c(1990, 1990, 1991, 1991),
                   State = c("AL", "MO", "AL", "MO"),
                   Population = c(1, 2, 2, 3),
                   Race = c("Black", "Hispanic", "Hispanic", "Black"))
# Calculate total population (the grand total across all years and states)
total_pop <- sum(Data$Population)
# Group by Year, State, and Race, then calculate each group's share
# of total_pop; save to new 'df' data frame
df <- Data %>%
  group_by(Year, State, Race) %>%
  summarise(percent = sum(Population) / total_pop)
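Note that the snippet above divides by the grand total over every year and state. If what you want is each group's share within its own state and year (e.g. non-Hispanic Blacks as a share of Alabama's 1990 population), one possible variation is to compute the denominator per Year/State group instead. This is a sketch on made-up numbers, and the name census_toy is just a placeholder:

```r
library(dplyr)

# Hypothetical toy data; values are made up for illustration
census_toy <- data.frame(
  Year = c(1990, 1990, 1990, 1991, 1991),
  State = c("AL", "AL", "MO", "AL", "AL"),
  Race = c("Black", "Hispanic", "Black", "Black", "Hispanic"),
  Population = c(1, 3, 2, 2, 2)
)

shares <- census_toy %>%
  # One row per Year/State/Race with its summed population
  group_by(Year, State, Race) %>%
  summarise(Population = sum(Population), .groups = "drop") %>%
  # Regroup by Year/State so sum(Population) is that state's total for that year
  group_by(Year, State) %>%
  mutate(percent = Population / sum(Population)) %>%
  ungroup()

print(shares)
```

Here percent is a proportion between 0 and 1, and the values sum to 1 within each Year/State pair; multiply by 100 if you need actual percentages.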

melmo