I have a numeric variable (df2$year) that I am trying to categorise into groups using the cut function. The grouping itself works correctly, but the levels of the resulting factor (df2$decade) are displayed in exponential notation rather than as plain numbers. I have tried setting options(scipen=99999), but it makes no difference. Here is the code and some output:
options(scipen=99999)
df2$year <- as.numeric(substr(df2$date, 1, 4))
summary(df2$year)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1973    2002    2010    2007    2014    2018   99866
str(df2$year)
num [1:100000] NA NA NA NA NA NA NA NA NA NA ...
table(df2$year)
1973 1980 1982 1986 1987 1988 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
   1    1    1    1    1    1    5    2    3    1    1    4    2    1    3    1    1    5    6    2    1    4    8    7    4    7    8   10    6   11    8
2016 2017 2018
   7    9    1
df2$decade <- cut(df2$year, breaks = c(1959, 1969, 1979, 1989, 1999, 2009, 2019))
summary(df2$decade)
(1.96e+03,1.97e+03] (1.97e+03,1.98e+03] (1.98e+03,1.99e+03]    (1.99e+03,2e+03]    (2e+03,2.01e+03] (2.01e+03,2.02e+03]                NA's
                  0                   1                   5                  22                  39                  67               99866
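In case it helps, here is a minimal sketch that should reproduce the behaviour without my data. The vector toy_years and its values are made up purely for illustration (they are not taken from df2); the breaks are the same as above.

```r
# Hypothetical toy data standing in for df2$year (not my real data)
toy_years <- c(1973, 1986, 1995, 2004, 2012, 2018, NA)

# Same decade breaks as in the call above
decade_breaks <- c(1959, 1969, 1979, 1989, 1999, 2009, 2019)

# Setting scipen first, as in my session, does not change the labels
options(scipen = 99999)

toy_decade <- cut(toy_years, breaks = decade_breaks)

# The factor levels come out in exponential notation,
# the first one being "(1.96e+03,1.97e+03]"
levels(toy_decade)

# Counts per level, with the NA kept as its own entry
summary(toy_decade)
```

The levels printed here have the same form as in the summary(df2$decade) output above, so the problem does not seem to depend on my particular data.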
Can anybody suggest a way to format the resulting df2$decade variable so that its levels are shown as plain numbers, e.g. (1959,1969], (1969,1979], etc. for the breaks above, rather than in exponential notation?
Thanks