In R, a multiple linear regression on a subset of rows can be written like this:
temp = lm(log(volume_1[11:62])~log(price_1[11:62])+log(volume_1[10:61]))
In Python, statsmodels accepts R-style formulas, so I expected the code below to work the same way:
import statsmodels.formula.api as smf
import pandas as pd
import numpy as np
rando = lambda x: np.random.randint(low=1, high=100, size=x)
df = pd.DataFrame(data={'volume_1': rando(62), 'price_1': rando(62)})
temp = smf.ols(formula='np.log(volume_1)[11:62] ~ np.log(price_1)[11:62] + np.log(volume_1)[10:61]',
data=df)
# np.log(volume_1)[10:61] expresses the lagged volume
However, I get this error:
PatsyError: Number of rows mismatch between data argument and volume_1[11:62] (62 versus 51)
volume_1[11:62] ~ price_1[11:62] + volume_1[10:61]
I guess it is not possible to regress on only part of the rows of each column, because data=df has 62 rows while the sliced variables have only 51 rows.
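To illustrate where the 62 versus 51 comes from, here is a minimal sketch. The Series `s` is only a stand-in for one column of df, and `part` and `r_like` are names made up for this example. Python slices are 0-based and end-exclusive, so [11:62] keeps 51 rows, whereas R's 11:62 is 1-based and inclusive and keeps 52.

```python
import numpy as np
import pandas as pd

# Stand-in for one 62-row column of df (illustrative values only)
s = pd.Series(np.arange(1, 63))

# Same slice as in the formula: 0-based, end-exclusive
part = s[11:62]

# Positional equivalent of R's 11:62 (1-based, inclusive)
r_like = s.iloc[10:62]

print(len(s), len(part), len(r_like))
```

So besides the mismatch with data=df, the Python slice also seems to be off by one row compared with the R call.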
Is there a convenient way to run this kind of regression in Python, as in R?
df is a pandas DataFrame with the columns volume_1 and price_1.
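One direction that might work, as a rough sketch rather than a confirmed solution: build the lagged volume as its own column with shift(1), then pass an already-sliced DataFrame as data so that every term in the formula has the same number of rows. The column name `volume_lag` and the variable `sub` are made up for this example, and the random data only mimics the shape of my real data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Random stand-in data with the same shape as the real df (62 rows)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'volume_1': rng.integers(1, 100, size=62),
    'price_1': rng.integers(1, 100, size=62),
})

# Hypothetical helper column: the previous row's volume (first row becomes NaN)
df['volume_lag'] = df['volume_1'].shift(1)

# R's 11:62 is 1-based and inclusive, which corresponds to iloc[10:62] (52 rows)
sub = df.iloc[10:62]

# Slice the data, not the formula terms, so all row counts agree
model = smf.ols('np.log(volume_1) ~ np.log(price_1) + np.log(volume_lag)', data=sub)
result = model.fit()
print(result.params)
```

Slicing the DataFrame instead of the individual formula terms keeps the formula readable, and the NaN that shift(1) introduces in the first row falls outside the selected rows.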