I am currently using Python's Patsy module to create matrix inputs for my model. For example, a formula I might use is

'Survived ~ C(Pclass) + C(Sex) + C(honor) + C(tix) + Age + SibSp + ParCh + Fare + Embarked + vowel + middle + C(Title)'

However, I would like to perform model selection, so I want to create all possible formulas, from the simplest model of

'Survived ~ Age'

to the most complicated model of

'Survived ~ C(Pclass) * C(Sex) * C(honor) * C(tix) * Age * SibSp * ParCh * Fare * Embarked * vowel * middle * C(Title)'

Is there a command in Patsy or some way I can generate all possible string combinations?

Naomi
  • I don't know Patsy but if you specify the rules for creating a valid formula string I might be able to help. – Denziloe Mar 08 '17 at 00:32

1 Answer

This seems like a simple string-generation problem:

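import itertools

survived = 'Survived'
# Binary formula operators; '**' is left out because Patsy expects an integer on its right-hand side.
operators = '+ - * / :'.split()
factors = """C(Pclass) C(Sex) C(honor) C(tix) Age SibSp ParCh Fare Embarked vowel middle C(Title)""".split()

# Try every term count from 1 up to and including all of the factors.
for l in range(1, len(factors) + 1):
    # Every ordered selection of l factors.
    for fax in itertools.permutations(factors, l):
        # Every way to fill the l-1 gaps between the factors with an operator.
        for ops in itertools.product(operators, repeat=(l-1)):
            # Interleave factors and operators; zip_longest pads the shorter sequence with None.
            expr = [val for pair in itertools.zip_longest(fax, ops) for val in pair if val is not None]
            print(survived, '~', ' '.join(expr))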
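
Because the loop above walks ordered permutations and every operator choice, the number of strings grows very quickly with 12 factors. If only main-effects models are of interest, a smaller enumeration may be enough. The following is a minimal sketch under that assumption: it treats term order as irrelevant, joins terms with '+' only, and uses a hypothetical helper name, `additive_formulas`, that is not part of Patsy. It yields one formula per non-empty subset of the factors, so 2**n - 1 formulas for n factors (4095 for the 12 factors in the question).

```python
import itertools


def additive_formulas(response, factors):
    """Yield one '+'-only formula per non-empty subset of factors."""
    for size in range(1, len(factors) + 1):
        # combinations() ignores order, so 'a + b' and 'b + a' appear only once.
        for subset in itertools.combinations(factors, size):
            yield f"{response} ~ {' + '.join(subset)}"


# A short factor list keeps the demonstration output small.
demo_factors = ["C(Pclass)", "C(Sex)", "Age"]
formulas = list(additive_formulas("Survived", demo_factors))
for formula in formulas:
    print(formula)
```

Each generated string can then be passed to whatever fitting routine is being compared. Interaction terms could be added by also enumerating subsets of factors to join with ':' or '*', at the cost of a much larger search space.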
aghast