My question is about a notation or Python operator that mimics R's formula shorthand for linear regression.
Say we have a dataset and want to model the dependent variable as a function of the independent variables. In R there is a simple shorthand: instead of listing every independent variable, one can use the dot (.) in the formula to stand for all the other columns. However, I can't find an operator in Python that does quite the same job. The colon ':' is used for the interaction between two columns, and the asterisk '*' is used for a list of columns together with their interactions.
For instance, let's say an insurance dataset has 6 independent variables.
model <- lm(expenses ~ age + children + bmi + sex + smoker + region,
data = insurance)
Instead of listing all the independent variables in the formula, one can use the dot '.' to indicate that all the other variables are included, so the call can be shortened to:
model <- lm(expenses ~ .,
data = insurance)
I am looking for an operator in Python that gives a similar effect to the one in R.
import statsmodels.formula.api as smf
model = smf.ols(formula='expenses ~ age + children + bmi + sex + smoker + region', data=insurance).fit()
I used statsmodels.formula.api (imported as smf) in Python to get results similar to those in R, but unfortunately I can't find anything that shortens the formula.
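The closest workaround I can think of is to build the formula string from the column names instead of typing it out. This is only a minimal sketch, not a built-in operator: dot_formula is a hypothetical helper name, the column list is assumed from the R example, and it assumes every column name is a valid Python identifier.

```python
def dot_formula(response, columns):
    """Build 'response ~ a + b + ...' from every column except the response."""
    predictors = [c for c in columns if c != response]
    return f"{response} ~ " + " + ".join(predictors)


# Column names assumed from the R example; with a pandas DataFrame
# one would pass insurance.columns instead of this list.
columns = ["age", "sex", "bmi", "children", "smoker", "region", "expenses"]
formula = dot_formula("expenses", columns)
print(formula)

# With statsmodels installed and a DataFrame named insurance:
# import statsmodels.formula.api as smf
# model = smf.ols(formula=formula, data=insurance).fit()
```

This gives the same set of predictors as expenses ~ . in R, but I would still prefer a real operator if one exists.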