I have been asked to report descriptive statistics for my results as the median and IQR of my categorical variables, but I do not know how to do that. I understand how these are calculated, but only for continuous data.

Can anyone explain how to calculate them for categorical variables, and how to do it in R?

Aura
  • Reporting the interquartile range for categorical variables makes little sense. Consider, for example, a dataset that contains records for 50 males and 50 females. What would a sensible (and informative) IQR for sex look like? – Limey Jan 19 '22 at 09:53
  • @Limey so it should be the middle 50% of the data? So 25% male and 25% female? But the problem is that one of my variables has 52 categories, each containing between 1 and 2000 patients. – Aura Jan 19 '22 at 10:05

1 Answer

I am assuming you want to calculate the median and IQR of a numeric variable within each level of a categorical variable. In base R, you can use aggregate for this. Alternatively, the tidyverse (specifically the dplyr package) provides the handy group_by and summarize functions.

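# Example data: one categorical grouping variable and two numeric variables
df <- data.frame(
  gender = c("m", "f", "m", "x"),
  age    = c(20, 21, 64, 42),
  length = c(191, 180, 176, 177)
)
# IQR and median of length within each gender group
aggregate(length ~ gender, df, IQR)
aggregate(length ~ gender, df, median)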

This gives the following output:

aggregate(length ~ gender, df, IQR)
  gender length
1      f    0.0
2      m    7.5
3      x    0.0

aggregate(length ~ gender, df, median)
  gender length
1      f  180.0
2      m  183.5
3      x  177.0
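
The group_by and summarize approach mentioned above could look roughly like the sketch below. It assumes the dplyr package is installed; the object name summary_by_gender and the column names median_length, iqr_length and n are made up for illustration.

```r
library(dplyr)

# Same example data as in the aggregate() version
df <- data.frame(
  gender = c("m", "f", "m", "x"),
  age    = c(20, 21, 64, 42),
  length = c(191, 180, 176, 177)
)

# One row per gender, with the median, IQR and group size of length
# (output column names are illustrative, not required by dplyr)
summary_by_gender <- df %>%
  group_by(gender) %>%
  summarise(
    median_length = median(length),
    iqr_length    = IQR(length),
    n             = n()
  )

print(summary_by_gender)
```

This returns both statistics in a single table, and the n column shows how many observations each group has, which helps when some groups are very small.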
typewriter