A Python library for describing statistical models and building design matrices, aimed at bringing the convenience of R “formulas” to Python.
Questions tagged [patsy]
113 questions
0 votes, 2 answers
Moving a bunch of distinct items to the end of a python list
I have this python list:
['Intercept', 'a', 'country[T.BE]', 'country[T.CY]', 'country[T.DE]', 'b', 'c', 'd', 'e']
I want the country items at the end:
['Intercept', 'a', 'b', 'c', 'd', 'e', 'country[T.BE]', 'country[T.CY]', 'country[T.DE]']
How to…
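
One way to approach the reordering above is a stable partition: keep the non-country names in their original order, then append the country names. This is a minimal sketch, assuming the items to move can be recognised by the prefix `country[`; the helper name `move_to_end` is made up for illustration and is not part of patsy.

```python
def move_to_end(names, prefix="country["):
    """Return a new list with every item starting with `prefix` moved to the end.

    Relative order is preserved inside both groups; the input list is not modified.
    """
    head = [n for n in names if not n.startswith(prefix)]
    tail = [n for n in names if n.startswith(prefix)]
    return head + tail


cols = ['Intercept', 'a', 'country[T.BE]', 'country[T.CY]', 'country[T.DE]', 'b', 'c', 'd', 'e']
reordered = move_to_end(cols)
print(reordered)
```

Two list comprehensions keep the logic readable; a single `sorted(cols, key=lambda n: n.startswith("country["))` would also work because Python's sort is stable.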

Martien Lubberink
0 votes, 1 answer
patsy dmatrices raising "AssertionError"
Noob trying my first Negative Binomial Regression. iPython on Google's Colab. I load the dataset as a pandas df. The features (and Target) in the formula below all appear in the df (which I named "dataset").
I also bring in
from patsy import…

RandomForestRanger
0 votes, 1 answer
How to write multivariate formula in python (patsy), does VAR support it?
I want to do multivariate data analysis using vector auto regression (VAR), but want more freedom. For example, the question I am dealing with can look like:
y1(t) = a11*y1(t-1) + a12*y1(t-2) + b11*y2(t-1) + c11*x1(t) + c12*x2(t) +…

user2355104
0 votes, 3 answers
PatsyError: Error evaluating factor: NameError:
I am an absolute newbie in Python programming and currently learning basic statistics on it.
I am getting a
"PatsyError: Error evaluating factor: NameError:"
from code containing pred = model.predict(pd.DataFrame(calo['wt']))
Below is my code:
#…

Sanjeev Raikar
0 votes, 1 answer
Why are dummy variables named with or without T?
Using patsy, I noticed that it sometimes names dummy variables with a T and sometimes without. Today I realised that the T is attached when a constant term is present in the regression equation, and omitted when there is no constant term. For example,…
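
The naming pattern described above matches the difference between treatment (reference-level) coding and full-rank indicator coding: with an intercept one level is dropped and the remaining columns carry a `T.` marker, and without an intercept every level gets its own column. The sketch below only mimics that naming in plain Python so the two cases can be compared side by side. The helper `dummy_columns` is hypothetical, it is not patsy's implementation, and it assumes the reference level is the first level in sorted order.

```python
def dummy_columns(name, values, intercept=True):
    """Build indicator columns for a categorical variable.

    With an intercept: drop the first (reference) level and label the rest
    'name[T.level]'. Without an intercept: keep every level, labelled
    'name[level]'. Returns (labels, rows).
    """
    levels = sorted(set(values))
    kept = levels[1:] if intercept else levels
    fmt = "{}[T.{}]" if intercept else "{}[{}]"
    labels = [fmt.format(name, lv) for lv in kept]
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    if intercept:
        # Prepend the constant column.
        labels = ["Intercept"] + labels
        rows = [[1] + r for r in rows]
    return labels, rows


values = ["a", "b", "c", "a"]
with_const, rows_const = dummy_columns("x", values, intercept=True)
no_const, rows_plain = dummy_columns("x", values, intercept=False)
print(with_const)
print(no_const)
```

In both cases the columns span the same space; only the parameterisation, and therefore the labels, differ.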

T_T
0 votes, 1 answer
R - Using patsy.dmatrices() with reticulate
I have a problem of namespace when trying to use function patsy.dmatrices() with the reticulate R package.
Here is a simple reproducible example:
patsy <- import("patsy")
# Data
dataset <- data.frame(Y=rnorm(1000,2.5,1))
# Null model
formula_null <-…

Ghislain Vieilledent
0 votes, 2 answers
Create a custom function in Patsy
import pandas as pd
import patsy
from patsy import dmatrices, dmatrix, demo_data
dt = pd.DataFrame({'F1': ['a','b','c','d','e','a'], 'F2': ['X','X','Y','Y','Z','Z']})
I know I can do this
dmatrix("1+I(F1=='a')",dt)
but can I create an arbitrary function in patsy? I'm trying…

xappppp
0 votes, 1 answer
convert cost function to statsmodels formula
I want to fit some data to a curve, using this as a cost function:
def cost_func(x):
    return ((unknown_conc - x[1] * (x[0] * conc_A +
                                    (1 - x[0]) * conc_B)) ** 2).sum()
It works when using scipy.optimize, but I want to use statsmodels instead.…
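
The cost function above is a sum of squared residuals for unknown_conc ≈ x[1]*(x[0]*conc_A + (1-x[0])*conc_B). Expanding the product gives p*conc_A + q*conc_B with p = x[0]*x[1] and q = (1-x[0])*x[1], which is an ordinary linear model without an intercept. In formula terms that would presumably be written as `unknown_conc ~ conc_A + conc_B - 1`, assuming the three arrays are columns of a DataFrame with those names. The sketch below does the same fit in plain Python by solving the 2x2 normal equations and then maps p and q back to x[0] and x[1]. The function name and the sample numbers are invented for illustration, and no bounds are enforced on x[0].

```python
def fit_mixture(conc_a, conc_b, unknown):
    """Least-squares fit of unknown = p*conc_a + q*conc_b (no intercept).

    Returns (x0, x1) with x1 = p + q and x0 = p / (p + q), i.e. the
    parameters of the original cost function.
    """
    saa = sum(a * a for a in conc_a)
    sbb = sum(b * b for b in conc_b)
    sab = sum(a * b for a, b in zip(conc_a, conc_b))
    say = sum(a * y for a, y in zip(conc_a, unknown))
    sby = sum(b * y for b, y in zip(conc_b, unknown))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("conc_a and conc_b are collinear; the fit is not unique")
    # Cramer's rule on the normal equations.
    p = (say * sbb - sab * sby) / det
    q = (saa * sby - sab * say) / det
    x1 = p + q
    return p / x1, x1


# Synthetic data generated with x0 = 0.25, x1 = 2.0 (made-up values).
conc_a = [1.0, 2.0, 3.0, 4.0]
conc_b = [2.0, 1.0, 0.0, 5.0]
unknown = [2.0 * (0.25 * a + 0.75 * b) for a, b in zip(conc_a, conc_b)]
x0, x1 = fit_mixture(conc_a, conc_b, unknown)
print(x0, x1)
```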

phenix
0 votes, 1 answer
How to modify a linear regression in python 3.6?
The code looks like:
import pandas as pd
import statsmodels.formula.api as smf
df = pd.read_csv('reg_data.csv')
f = 'inf ~ rh*temp*tl*Tt*C(location)'
lm = smf.ols(formula = f, data=df).fit()
But it always gives me an error:
numbers besides '0' and '1' are only…

Jenny
0 votes, 1 answer
How to stop patsy from creating redundant interactions of categorical variables
I'm using patsy to fit regressions with statsmodels using the formula api.
My problem is that my design matrix is singular because patsy creates (locally?) redundant interactions of categoricals.
import patsy
import pandas as pd
data =…

Artturi Björk
0 votes, 1 answer
Unmodified column name index in patsy
I am using patsy to prepare categorical data for regression and want to map from a column name to its index in the DesignMatrix. I have tried using the column_name_indexes attribute of the DesignInfo object but the column names have been modified to…

DontDivideByZero
0 votes, 0 answers
Fitting exact equations using statsmodels and patsy
Statsmodels allows the use of R-style formulas for equation fitting using patsy and statsmodels.formula.api. I would like to fit a specific function using columns in a pandas DataFrame, however, I can only seem to get close. For example, if I have…

David Hagan
0 votes, 1 answer
fitting for offset in a patsy model
Using patsy, I understand how to turn intercepts on or off. But I haven't managed to get horizontal offsets. For instance, I would like to be able to fit, in essence
y = alpha + beta * abs(x_opt - x_obs)
with x_opt free in the fit. I tried write…
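
The model above is not linear in x_opt, so it cannot be written as a single linear formula. For any fixed x_opt, however, it is a simple linear regression of y on abs(x_opt - x_obs), which suggests profiling: try a grid of candidate offsets, fit alpha and beta for each, and keep the offset with the smallest residual sum of squares. The sketch below does this in plain Python. The function names, the grid, and the synthetic data are assumptions made for illustration, and a grid search is only one of several ways to handle the free offset.

```python
def fit_line(u, y):
    """Ordinary least squares for y = alpha + beta * u.

    Returns (alpha, beta, sse), or None when u is constant.
    """
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    s_uu = sum((a - mean_u) ** 2 for a in u)
    if s_uu == 0:
        return None  # beta is not identifiable
    s_uy = sum((a - mean_u) * (b - mean_y) for a, b in zip(u, y))
    beta = s_uy / s_uu
    alpha = mean_y - beta * mean_u
    sse = sum((b - alpha - beta * a) ** 2 for a, b in zip(u, y))
    return alpha, beta, sse


def fit_offset(x_obs, y, candidates):
    """Return (x_opt, alpha, beta, sse) for the candidate offset with the lowest sse."""
    best = None
    for x_opt in candidates:
        u = [abs(x_opt - x) for x in x_obs]
        fit = fit_line(u, y)
        if fit is None:
            continue
        alpha, beta, sse = fit
        if best is None or sse < best[3]:
            best = (x_opt, alpha, beta, sse)
    return best


# Synthetic, noise-free data with alpha = 2, beta = 3, x_opt = 4 (made-up values).
x_obs = [float(i) for i in range(11)]
y = [2.0 + 3.0 * abs(4.0 - x) for x in x_obs]
grid = [i * 0.5 for i in range(21)]  # candidate offsets 0.0, 0.5, ..., 10.0
x_opt, alpha, beta, sse = fit_offset(x_obs, y, grid)
print(x_opt, alpha, beta)
```

The resolution of the result is limited by the grid spacing; a finer grid around the best candidate, or a general nonlinear least-squares routine, would refine it.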

James Saxon
0 votes, 1 answer
Removing Terms from Patsy Formulas with re
Context: Python 3.4.3
I'm not very good with regular expressions, and I can't seem to figure out a robust solution to this using re.
Suppose we have a long patsy formula and somewhere in the middle is an expression like:
... + xvar +…
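
For a formula made of simple additive terms, splitting on `+` and comparing whole terms can be more robust than a regular expression, because it never matches a term that merely contains the target as a substring. The sketch below takes that approach. It assumes terms are separated by top-level `+` signs only (no `+` inside parentheses such as `I(a + b)`, and no `-` terms), and the helper name `drop_term` is invented for illustration.

```python
def drop_term(formula, term):
    """Remove every exact occurrence of `term` from an additive formula string."""
    lhs, sep, rhs = formula.partition("~")
    if not sep:
        # No '~' present: treat the whole string as the right-hand side.
        lhs, rhs = "", formula
    terms = [t.strip() for t in rhs.split("+")]
    kept = [t for t in terms if t != term]
    new_rhs = " + ".join(kept)
    return f"{lhs.strip()} ~ {new_rhs}" if sep else new_rhs


result = drop_term("y ~ a + xvar + b", "xvar")
untouched = drop_term("y ~ a + xvar2 + b", "xvar")
print(result)
print(untouched)
```

Because the comparison is on whole terms, `xvar2` survives when `xvar` is removed, which is the case a naive pattern tends to get wrong.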

chriswhite
0 votes, 1 answer
Why is patsy returning additional columns when I add a None value?
I'm using patsy to create matrices, but I get strange behavior when None or NaN values are in the dataset. As seen below, instead of just dropping the None row, it creates additional columns of 1's and 0's.
import numpy as np
import pandas as…

rsgmon