Questions tagged [patsy]

A Python library for describing statistical models and building design matrices, aimed at bringing the convenience of R “formulas” to Python.

113 questions
0 votes, 2 answers

Moving a bunch of distinct items to the end of a Python list

I have this Python list: ['Intercept', 'a', 'country[T.BE]', 'country[T.CY]', 'country[T.DE]', 'b', 'c', 'd', 'e'] I want the country items at the end: ['Intercept', 'a', 'b', 'c', 'd', 'e', 'country[T.BE]', 'country[T.CY]', 'country[T.DE]'] How to…
Martien Lubberink
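
One possible way to get the ordering asked for above, sketched under the assumption that the items to move can be recognised by a common prefix (here `country[`). The helper name `move_to_end` is hypothetical and not taken from the question or its answers.

```python
def move_to_end(items, prefix):
    """Return a new list with every item starting with `prefix` moved to the end.

    The relative order inside both groups is preserved.
    """
    front = [x for x in items if not x.startswith(prefix)]
    back = [x for x in items if x.startswith(prefix)]
    return front + back


cols = ['Intercept', 'a', 'country[T.BE]', 'country[T.CY]',
        'country[T.DE]', 'b', 'c', 'd', 'e']
reordered = move_to_end(cols, 'country[')
print(reordered)
```

Because Python's sort is stable, `sorted(cols, key=lambda x: x.startswith('country['))` should give the same ordering in a single expression.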
0 votes, 1 answer

patsy dmatrices raising "AssertionError"

Noob trying my first Negative Binomial Regression. iPython on Google's Colab. I load the dataset as a pandas df. The features (and Target) in the formula below all appear in the df (which I named "dataset"). I also bring in from patsy import…
RandomForestRanger
0 votes, 1 answer

How to write a multivariate formula in Python (patsy); does VAR support it?

I want to do multivariate data analysis using vector autoregression (VAR), but want more freedom. For example, the question I am dealing with can look like: y1(t) = a11*y1(t-1) + a12*y1(t-2) + b11*y2(t-1) + c11*x1(t) + c12*x2(t) +…
user2355104
0 votes, 3 answers

PatsyError: Error evaluating factor: NameError:

I am an absolute newbie in Python programming and am currently learning basic statistics with it. I am getting a "PatsyError: Error evaluating factor: NameError:" from code with pred = model.predict(pd.DataFrame(calo['wt']) Below is my code: #…
Sanjeev Raikar
0 votes, 1 answer

Why are dummy variables named with or without T?

Using patsy, I noticed that it sometimes names dummy variables with T and sometimes without. Today I realised that T is attached when the constant term is present in a regression equation, and omitted when there is no constant term. For example,…
T_T
0 votes, 1 answer

R - Using patsy.dmatrices() with reticulate

I have a namespace problem when trying to use the function patsy.dmatrices() with the reticulate R package. Here is a simple reproducible example: patsy <- import("patsy") # Data dataset <- data.frame(Y=rnorm(1000,2.5,1)) # Null model formula_null <-…
0 votes, 2 answers

Create a custom function in Patsy

import patsy from patsy import dmatrices, dmatrix, demo_data dt=pd.DataFrame({'F1':['a','b','c','d','e','a'],'F2':['X','X','Y','Y','Z','Z']}) I know I can do this dmatrix("1+I(F1=='a')",dt) but can I create an arbitrary function in patsy? I'm trying…
xappppp
0 votes, 1 answer

convert cost function to statsmodels formula

I want to fit some data to a curve, using this as a cost function: def cost_func(x): return ((unknown_conc-x[1]*(x[0]*conc_A+ (1-x[0])*conc_B))**2).sum() It works when using scipy.optimize, but I want to use statsmodels instead.…
phenix
0 votes, 1 answer

How to modify a linear regression in Python 3.6?

The code looks like: import statsmodels.formula.api as smf df = pd.read_csv('reg_data.csv') f = 'inf ~ rh*temp*tl*Tt*C(location)' lm = smf.ols(formula = f, data=df).fit() But it always gives me an error: numbers besides '0' and '1' are only…
Jenny
0 votes, 1 answer

How to stop patsy from creating redundant interactions of categorical variables

I'm using patsy to fit regressions with statsmodels using the formula api. My problem is that my design matrix is singular because patsy creates (locally?) redundant interactions of categoricals. import patsy import pandas as pd data =…
Artturi Björk
0 votes, 1 answer

Unmodified column name index in patsy

I am using patsy to prepare categorical data for regression and want to map from a column name to its index in the DesignMatrix. I have tried using the column_name_indexes attribute of the DesignInfo object but the column names have been modified to…
DontDivideByZero
0 votes, 0 answers

Fitting exact equations using statsmodels and patsy

Statsmodels allows the use of R-style formulas for equation fitting using patsy and statsmodels.formula.api. I would like to fit a specific function using columns in a pandas DataFrame, however, I can only seem to get close. For example, if I have…
David Hagan
0 votes, 1 answer

fitting for offset in a patsy model

Using patsy, I understand how to turn intercepts on or off. But I haven't managed to get horizontal offsets. For instance, I would like to be able to fit, in essence y = alpha + beta * abs(x_opt - x_obs) with x_opt free in the fit. I tried write…
James Saxon
0 votes, 1 answer

Removing Terms from Patsy Formulas with re

Context: Python 3.4.3 I'm not very good with regular expressions, and I can't seem to figure out a robust solution to this using re. Suppose we have a long patsy formula and somewhere in the middle is an expression like: ... + xvar +…
chriswhite
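
The excerpt above is truncated, so the exact expression to remove is unknown. As a rough sketch only, the following avoids `re` altogether and instead splits the right-hand side of a formula on top-level `+` signs, which may be more robust when a term name also appears inside parentheses. The helpers `split_terms` and `remove_term` and the example formula are hypothetical; the sketch assumes terms are joined only by `+` and that no quoted strings contain brackets.

```python
def split_terms(rhs):
    """Split a formula's right-hand side on '+' signs that are not nested
    inside parentheses or square brackets."""
    terms, current, depth = [], [], 0
    for ch in rhs:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "+" and depth == 0:
            # A top-level '+' closes the term collected so far.
            terms.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    terms.append("".join(current).strip())
    return terms


def remove_term(formula, term):
    """Return `formula` with every top-level term equal to `term` dropped."""
    lhs, rhs = formula.split("~", 1)
    kept = [t for t in split_terms(rhs) if t != term]
    return lhs.strip() + " ~ " + " + ".join(kept)


# Hypothetical formula: only the stand-alone 'xvar' term is removed.
cleaned = remove_term("y ~ a + xvar + I(b + xvar) + c", "xvar")
print(cleaned)
```

Terms joined with `-`, `*` or `:` are left untouched by this sketch, so a formula using those operators would need extra handling.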
0 votes, 1 answer

Why is patsy returning additional columns when I add a None value?

I'm using patsy to create matrices, but I get strange behavior when None or NaN values are in the dataset. As seen below, instead of just dropping the None row it creates additional columns with 1's and 0's. import numpy as np import pandas as…
rsgmon