
I have a database with categorical variables that have a very large number of categories.

I would like to recode such a variable into fewer categories, in this case 2, and decide which new category each original category goes into based on its mean value on another variable.

When I have a small number of categories (in this case 10), I use this script:

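library(plyr)  # revalue() is assumed to come from the plyr package

data$V152 <- as.numeric(data$V152)

# Map each original code to 0 or 1; codes not listed here (e.g. "1") are left unchanged
data$V152 <- as.numeric(revalue(as.character(data$V152),
           c("2"="0", "3"="1", "4"="0", "5"="1", "6"="1", "7"="0", "8"="0", "9"="0", "10"="0")))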

But how do I do this for a categorical variable with a very large number of categories?

Looking at the picture, I want the categories with a mean above the line to be recoded as 1 and the others as 2.

Boxplots of categories

  • If data$V152 is being plotted on that image, then it already is `numeric`. The request to make a discrete coding of 1's and 2's for values above or below mean() has been asked and answered many times before: `data$V152 <- 1+( data$V152 < mean(data$V152) )`. In general I do not advise such destructive recoding unless you are very sure you have a simple recovery method in place. – IRTFM Mar 03 '16 at 19:15
  • Hi, the variable being plotted is a different one. I just showed how I recode a categorical variable with only 10 different categories. – Tim Mar 03 '16 at 19:21
  • I'm unable to connect the unlabeled boxplot to the rest of the content of your question. Is this a two-part question about unrelated topics? – IRTFM Mar 03 '16 at 20:24
  • The R code I put in shows how I solve the problem when I have a categorical variable with a small number of categories (10 categories). The boxplot is of another categorical variable, with each box representing a category. Because I have so many categories, the revalue function becomes too difficult to use. So I would like to recode this categorical variable using the mean score each category has on another variable (y axis). If the mean score of the category is below 450, I want to put it in the new category 2; if it is above, I want to put it in category 1. – Tim Mar 03 '16 at 21:01
  • Use `ave`. There are lots of worked examples on SO. – IRTFM Mar 03 '16 at 22:19
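
The `ave` suggestion in the last comment could look roughly like the sketch below. The data frame, the column names `group` and `score`, and the toy values are made up for illustration. The cutoff of 450 is the one mentioned in the comments, and treating a mean of exactly 450 as category 2 is an assumption. The result is written to a new column so the original coding is kept.

```r
# Hypothetical toy data: `group` stands in for the many-level categorical
# variable, `score` for the numeric variable on the y axis of the boxplot.
data <- data.frame(
  group = factor(rep(c("a", "b", "c", "d"), each = 3)),
  score = c(400, 420, 440,
            480, 500, 520,
            300, 450, 460,
            450, 470, 700)
)

# ave() returns, for every row, the mean score of that row's group.
group_mean <- ave(data$score, data$group, FUN = mean)

# Recode into a NEW column: 1 if the group mean is above the cutoff, else 2.
cutoff <- 450  # value taken from the comments
data$group2 <- ifelse(group_mean > cutoff, 1, 2)

# Optional check: one mean per original category.
level_means <- tapply(data$score, data$group, mean)
print(level_means)
print(table(data$group, data$group2))
```

With this toy data, groups `b` and `d` have means above 450 and end up in category 1, while `a` and `c` end up in category 2. The same two lines (`ave` followed by `ifelse`) work regardless of how many levels the categorical variable has.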

0 Answers