I am following an online econometrics course and learning statsmodels as I go.
I know from the instructor that this regression will have a better fit on a logarithmic scale, but I don't know how or where to convert my data / formula.
I am using Python, pandas, statsmodels, and Patsy.
Here is where I built the design matrices with dmatrices:
y, X = dmatrices('PRICE ~ QUANTITY', data=df, return_type='dataframe')
Here is where I ran the regression in statsmodels:
mod = sm.OLS(y, X) # Describe model
res = mod.fit() # Fit Model
print(res.summary()) # Summarize model
I get a very low R-squared, but the model does run. I'm just trying to figure out how to convert to a log scale. In the example given in the course, the instructor converted both the X and Y axes to log scales.
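For context, one place the conversion could go is the formula string itself, since Patsy evaluates function calls such as np.log written inside a formula. The following is only a minimal sketch: the toy DataFrame is made up purely for illustration (it is not the course data), and it assumes every PRICE and QUANTITY value is strictly positive so that the logarithm is defined.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices

# Hypothetical toy data standing in for the real df; values must be > 0.
df = pd.DataFrame({
    "PRICE": [10.0, 8.0, 6.5, 5.0, 4.0],
    "QUANTITY": [1.0, 2.0, 3.0, 5.0, 8.0],
})

# np.log is applied to both sides while the design matrices are built,
# so df itself is left unchanged.
y, X = dmatrices("np.log(PRICE) ~ np.log(QUANTITY)",
                 data=df, return_type="dataframe")

print(X.columns.tolist())
```

The resulting y and X can be passed to sm.OLS(y, X) exactly as in the original code; np has to be imported in the scope where dmatrices is called so that Patsy can resolve it.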
EDIT: I got it to work using this:
df2['Quantity'] = np.log(df['QUANTITY'])
df2['Price'] = np.log(df['PRICE'])
Is there a way to do that in one line of code, or with a loop if I needed to apply it to a few more variables in another problem?
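To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, not necessarily the best way to do it. It assumes all selected columns are numeric and strictly positive; the toy DataFrame, the cols list, and the "_log" suffix are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real df; values must be > 0.
df = pd.DataFrame({
    "PRICE": [10.0, 8.0, 6.5, 5.0, 4.0],
    "QUANTITY": [1.0, 2.0, 3.0, 5.0, 8.0],
})

cols = ["PRICE", "QUANTITY"]  # columns to transform (illustrative list)

# One line: NumPy ufuncs work element-wise on a whole DataFrame selection
# and return a new DataFrame with the same column names.
df2 = np.log(df[cols])

# Loop version: keep the originals and add a log column next to each one.
df3 = df.copy()
for col in cols:
    df3[col + "_log"] = np.log(df3[col])

print(df2.head())
```

The one-line version returns a new DataFrame with the same column names, while the loop keeps the original columns alongside the transformed ones, which may be handier when only some variables should be logged.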