I'd like to use sentiment scores to predict each stock's return (stock1, stock2, and stock3). Please see the sample dataset below.
import pandas as pd
data = {"sentiment": [0.9, 0.75, 0.88, 0.23], "stock1": [0.0015, 0.034, -0.065, 0.015], "stock2": [0.023, -0.001, 0.0098, 0.072], "stock3": [-0.0052, 0.0083, 0.012, 0.094]}
sample = pd.DataFrame(data, columns=['sentiment', 'stock1', 'stock2', 'stock3'])
print(sample)
Instead of running the regression three times, I'd like to use a for loop to iterate over the three stock returns. Here is my attempt:
import statsmodels.api as sm
from statsmodels.formula.api import glm
diff_stock = ['stock1', 'stock2', 'stock3']
for i in diff_stock:
    model = glm(formula='i ~ sentiment', data=sample, family=sm.families.Gaussian()).fit()
    print(model.summary())
However, I keep getting this error message:
PatsyError: Number of rows mismatch between data argument and i (3377 versus 1)
    i ~ favorite_count
It seems like there's only one value in i (where I expected the whole stock column), but I don't understand why.