Suppose I have a table of ages:
ages <- array(round(runif(min=10,max=200,n=100)),dim=100,dimnames=list(age=0:99)) # simulated counts by single year of age, 0-99
Suppose now I want to collapse my ages table into 5-year-wide age groups.
This could be done quite easily by summing the counts over the appropriate index ranges:
ages.5y <- array(NA,dim=20,dimnames=list(age=paste(seq(from=0,to=95,by=5),seq(from=4,to=99,by=5),sep="-")))
ages.5y[1]<-sum(ages[1:5])
ages.5y[2]<-sum(ages[6:10])
...
ages.5y[20]<-sum(ages[96:100])
It could also be done using a loop:
for(i in 1:20) ages.5y[i]<-sum(ages[(5*i-4):(5*i)])
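For the regular case, the loop can also be replaced by a vectorized reshape. This is just a sketch, and it relies on the table length being an exact multiple of the group width:
ages.5y <- colSums(matrix(ages,nrow=5)) # reshape into a 5 x 20 matrix, one column per 5-year group, and sum each column
names(ages.5y) <- paste(seq(from=0,to=95,by=5),seq(from=4,to=99,by=5),sep="-")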
But while this approach is easy to write for "regular" transformations, the loop becomes impractical when the new intervals are irregular, e.g. 0-4, 5-12, 13-24, 25-50, 60-99.
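To illustrate, writing those irregular intervals out by hand takes one statement per group (a sketch using the intervals above; note the off-by-one between the 0-based age labels and R's 1-based indices):
ages.irr <- c("0-4"=sum(ages[1:5]),"5-12"=sum(ages[6:13]),"13-24"=sum(ages[14:25]),"25-50"=sum(ages[26:51]),"60-99"=sum(ages[61:100]))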
If, instead of a table, I had individual values, this could be done quite easily using cut:
flattened <- rep(as.numeric(dimnames(ages)$age),ages)
table(cut(flattened,breaks=seq(from=0,to=100,by=5),right=FALSE)) # right=FALSE gives intervals [0,5),[5,10),... matching the groups 0-4,5-9,...
This allows the use of arbitrary break points, e.g. breaks=c(5,10,22,33,41,63,88).
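For example, with the break points above (ages below 5 or from 88 up fall outside all intervals, become NA, and are silently dropped by table):
table(cut(flattened,breaks=c(5,10,22,33,41,63,88),right=FALSE))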
However, this is quite a resource-intensive way to do it, since the whole table is first expanded into individual observations.
So, my question is: Is there a better way to recode a contingency table?