I am studying statistics (using SPSS) with OU but want to learn how to perform tests using R.
For dose-response analysis in stratified contingency tables, I want to reproduce using R the linear-by-linear association test of SPSS. I have managed to produce the same answers as SPSS using the attached code but I would like to find out how to format the dataframe in a way that allows me to use the mantelhaen.test() function from the R 'Stats' package.
library(tidyverse)
library(foreign)
library(vcdExtra) # for CMHtest
# import data from SPSS (CB1, Activity 4.1)
df.sav <- read.spss('O:/M249/SPSS/Data/Book1/smoking2.sav', use.value.labels = TRUE, to.data.frame = TRUE)
# Create contingency table
table.sav <- xtabs(count ~ dose + casecon, df.sav)
strata <- sort(unique(df.sav$dose))
# Pearson chi-squared test of association (Pearson Chi-Square in SPSS)
chi.stat <- unname(chisq.test(table.sav)$statistic)
chi.df <- unname(chisq.test(table.sav)$parameter)
chi.p <- unname(chisq.test(table.sav)$p.value)
# test of linear trend (Linear-by-Linear Association in SPSS)
lin_ass.stat <- CMHtest(table.sav, rscores = strata, types = "cor")$table[1]
lin_ass.df <- CMHtest(table.sav, rscores = strata, types = "cor")$table[2]
lin_ass.p <- CMHtest(table.sav, rscores = strata, types = "cor")$table[3]
This is the dataframe imported from SPSS:
> df.sav
count dose casecon
1 32 50.0 Cases: Died of lung cancer
2 13 50.0 Controls
3 136 37.0 Cases: Died of lung cancer
4 71 37.0 Controls
5 196 19.5 Cases: Died of lung cancer
6 190 19.5 Controls
7 250 9.5 Cases: Died of lung cancer
8 293 9.5 Controls
9 35 2.0 Cases: Died of lung cancer
10 82 2.0 Controls
My 'lin_ass.stat' produces the same value as Linear-by-Linear Association in SPSS but I would like to run a similar analysis using the R/Stats function 'mantelhaen.test()'. However, I cannot work out how to produce a 'table.sav' in the correct format to use with 'mantelhaen.test()'. I thought that I was getting close with Arranging a 3 dimensional contingency table in R in order to run a Cochran-Mantel-Haenszel analysis? but my dataframe already has an aggregated 'count' field. If I use my current 'table.sav', I get an error:
mantelhaen.test(table.sav)
Error in mantelhaen.test(table.sav) : 'x' must be a 3-dimensional array
How should I prepare a tablefrom my dataframe that I can present to 'mantelhaen.test()'?