I have a categorical variable, say cat_var, which can take the following values: cat_var = ["A", "B", "C", "D"]

I run a series of regressions, and patsy makes it easy to describe a regression: regr = "y ~ x + C(cat_var)"

I was wondering what the easiest way is to tune how the categorical variable is used. For example, let's say I would like patsy to create dummies only for "A" and "B", i.e. "C" and "D" are treated as one single group. I could remap cat_var to another set of values, but is there some sugar in patsy that already does this?

NoIdeaHowToFixThis

1 Answer

There are currently no ready-made tools for this.

For a similar question, see https://stackoverflow.com/questions/29015038/python-quantreg-categorical, and for a related discussion, see the thread at https://groups.google.com/d/msg/pystatsmodels/awZU2jM6xr0/gthF1t1QNksJ

Efficient recipes would be welcome. I would use pandas to create a new variable in which "C" and "D" are merged into a single "CD" level.
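A minimal sketch of the pandas remapping, assuming the data sits in a DataFrame. The sample values and the column name cat_var_cd are made up for illustration; only y, x and cat_var come from the question.

```python
import pandas as pd

# Hypothetical example data; the column names y, x and cat_var follow the question.
df = pd.DataFrame({
    "y": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x": [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
    "cat_var": ["A", "B", "C", "D", "A", "C"],
})

# Merge "C" and "D" into a single "CD" level; "A" and "B" are left unchanged.
# cat_var_cd is an illustrative name, not something patsy requires.
df["cat_var_cd"] = df["cat_var"].replace({"C": "CD", "D": "CD"})

# Use the merged column in the formula instead of the original one.
regr = "y ~ x + C(cat_var_cd)"
```

The formula string can then be used together with df in the same way as the original one. Which level ends up as the reference category depends on the contrast coding, so check the resulting design matrix if that matters.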

Community
Josef