I have been working with some tidycensus data for an assignment, and I am stuck: when I try to generate a smoothed line graph with `geom_smooth`, my data is not plotted.

My current code is:

PA_county_list %>%
  filter(county %in% c("Chester County", "Bucks County")) %>%
  ggplot() +
  geom_smooth(mapping = aes(x = total.pop, y = mean.white, color = county)) +
  labs(title = "Comparing Percent White Race in Chester County and Bucks County",
       subtitle = "2010 ACS 5 year census survey",
       y = "White Race Claims",
       x = "Total Population")


This is a sample of the data I am using:

county            total.pop    mean.white    mean.income    per_white
<chr>                 <dbl>         <dbl>          <dbl>        <dbl>
Chester County        41413      3694.957       88997.22     3.716587
Bucks County          47969      3946.140       79940.48     3.969241

Running this code produces a labeled but blank graph: the title and axis labels are intact, but nothing from total.pop (population) or mean.white (white population) is drawn.

At this point, any insight would be greatly appreciated.

Thanks.

  • Can you provide an example of your data structure? https://stackoverflow.com/help/minimal-reproducible-example – Sinval Feb 06 '21 at 23:11
  • Hi Sinval. Thanks for the response, this is my first post on stack so I wasn't 100% sure how to go about posting. I edited the original post to add a sample of two counties in the `head()` of my dataset. – Charles Cini Feb 06 '21 at 23:48
  • Go Bucks County! If you have one point per county, do you mean to use `geom_point`? – LMc Feb 07 '21 at 00:36
  • Hi LMc, I was thinking that at first but I'm using ACS centennial data from 2010 which was cleaned to represent an average of all rows containing Bucks County :(. I'm in the process of going back to the original un- `mean()` ed set to see if that would work. Otherwise I would just like to represent a graph showing correlation of population rise to white race represented – Charles Cini Feb 07 '21 at 00:46
  • LMc you rock!!!! I fumbled the dataset I was working with. I went back to the original un averaged dataset and it generated a graph I wanted! – Charles Cini Feb 07 '21 at 00:50

2 Answers

0

So I figured out what I was doing wrong! The dataset I passed to the plot was one I had built to calculate averages for other problems on the assignment, so it contained only a single averaged observation per county.

The fix was to go back to my originally cleaned dataset and point the plot at the original variables, from before the averages were taken.
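
For anyone hitting the same blank plot, here is a minimal sketch of the check that would have caught it. The `tracts` data frame and its `white.pop` column are hypothetical and filled with simulated values, not the real ACS data; the point is only to compare the number of rows per county before and after averaging.

```r
library(dplyr)
library(ggplot2)

# Hypothetical tract-level data: many rows per county.
# Values are simulated, not real ACS figures.
set.seed(42)
tracts <- data.frame(
  county    = rep(c("Chester County", "Bucks County"), each = 30),
  total.pop = round(runif(60, 1000, 8000))
) %>%
  mutate(white.pop = round(total.pop * runif(60, 0.7, 0.95)))

# Averaging collapses each county to a single row, like the table in the question.
averaged <- tracts %>%
  group_by(county) %>%
  summarise(total.pop = mean(total.pop), mean.white = mean(white.pop))

# Count rows per county in each table before plotting.
rows_per_county_avg <- count(averaged, county)  # one row per county
rows_per_county_raw <- count(tracts, county)    # 30 rows per county

# A smoother has something to fit only on the un-averaged rows.
p <- ggplot(tracts, aes(x = total.pop, y = white.pop, color = county)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess", formula = y ~ x)
```

Printing `p` draws one smoothed curve per county. With the `averaged` table there is only one point per group, so there is nothing for the smoother to fit.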

0

You only have two points in your data judging by your plot title. If that's the case then you wouldn't/couldn't smooth. You can simply connect these dots using geom_line:

ggplot(df, mapping = aes(x = total.pop, y = mean.white)) +
  geom_point(aes(color = county)) +
  geom_line() +
  labs(title = "Comparing Percent White Race in Chester County and Bucks County",
       subtitle = "2010 ACS 5 year census survey",
       y = "White Race Claims",
       x = "Total Population")


If you had many more data points you could smooth like this:

ggplot(df, mapping = aes (x = total.pop , y = mean.white)) +
  geom_smooth(method = "loess", formula = y ~ x, color = "black") +
  geom_point(aes(color = county)) 



Data

library(dplyr)    # provides %>% and mutate()
library(ggplot2)  # needed for the plots above

set.seed(1)
df <- data.frame(county = c("Chester", "Bucks", "Berks", "Montgomery", "Delaware", "Schuylkill"),
                 total.pop = rnorm(6, 48000, 3800)) %>%
  dplyr::mutate(mean.white = rbeta(6, 5, 2) * total.pop)
LMc