I have a data frame with variables:

$ ID                  : int  9224101
$ IUCR                : Factor w/ 360
$ Primary.Type        : Factor w/ 32
$ Year                : int  2013 

IUCR (Illinois Uniform Crime Reporting code)

I want to plot a time series that shows all the years on the x axis and the number of crimes that happened each year on the y axis, using at=10^(0:6) for the y-axis ticks so the numbers wouldn't be as high.

I've tried using:

plot.ts(dd$Year, dd$ID)

I've also tried:

ggplot(data = dd, aes(Year, ID)) + geom_line()
camnesia
  • What is the question? Please explain what you are looking for. – Prradep Nov 04 '16 at 16:21
  • I doubt the ID variable is the one you want to plot. Is there another field that actually lists the number of crimes? I don't see it in your variable list. – Puddlebunk Nov 04 '16 at 16:39
  • There is no variable that specifically counts the number of crimes. I assumed R was able to do it for me by looking at Year and counting how many IDs match that year. It's my first time using R. In that case, should I count the total for each year, put the values in 2 new variables, and use those to plot? – camnesia Nov 04 '16 at 17:53
  • > aggregate(cbind(Year) ~ Year, data = dd, sum)

        Year     Year
    1   2001 77814888
    2   2002 78546468
    3   2003 76440489
    4   2004 76330356
    5   2005 73342900
    6   2006 73180886
    7   2007 71164206
    8   2008 70514936
    9   2009 64517026
    10  2010 60402510
    11  2011 57595040
    12  2012 54605680
    13  2013 50091492
    14  2014 44728926
    15  2015 42345225
    16  2016 32128992

    I think this is the number of crimes – camnesia Nov 04 '16 at 18:02

1 Answer

If each observation represents one crime, then you could do something like:

library(dplyr)
# Each row is one crime, so give every row a weight of 1 and sum it per year.
dd$count <- 1
dd_by_year <- dd %>%
  group_by(Year) %>%
  summarize(crime = sum(count, na.rm = TRUE))

Then you should have crime by year that you can plot in any manner you like.
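
For instance, here is a minimal base R sketch of the whole thing: counting rows per year and plotting them on a log y axis with ticks at 10^(0:6), as the question asks. It assumes, as above, that each row is one crime. `dd_demo` is a made-up stand-in for `dd` with random years, and `crime_by_year` is just an illustrative name, so swap in your own data frame.

```r
# Hypothetical stand-in for `dd`: one row per crime, with only a Year column.
set.seed(1)
dd_demo <- data.frame(Year = sample(2001:2016, 5000, replace = TRUE))

# Count rows per year; as.data.frame() turns the table into Year/crime columns.
crime_by_year <- as.data.frame(table(Year = dd_demo$Year),
                               responseName = "crime",
                               stringsAsFactors = FALSE)
crime_by_year$Year <- as.integer(crime_by_year$Year)

# Plot the yearly counts on a log10 y axis, suppressing the default ticks...
plot(crime_by_year$Year, crime_by_year$crime, type = "l", log = "y",
     ylim = c(1, 1e6), yaxt = "n",
     xlab = "Year", ylab = "Number of crimes")

# ...and draw ticks at the powers of ten instead.
tick_at <- 10^(0:6)
axis(2, at = tick_at,
     labels = format(tick_at, big.mark = ",", scientific = FALSE, trim = TRUE))
```

If you already have dplyr loaded, `count(dd, Year, name = "crime")` should give the same per-year totals in one call.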

Puddlebunk