I have historical rating data on users, and I would like to fit an Ordinary Least Squares regression for each user to find the trend in that user's ratings.
My data looks like this:
user_id rating item_id date
12 3 19 2010-03-17
13 4 20 2010-03-18
1 3 123 2010-03-19
12 3.5 340 2010-03-17
19 2 19 2010-04-17
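Here is my function:
import numpy as np

def coef(y):
    # Slope of an OLS line fitted to y against its row index 0..s-1
    s = y.shape[0]
    # Design matrix: index column plus a column of ones for the intercept
    A = np.vstack([np.arange(s), np.ones(s)]).T
    m, c = np.linalg.lstsq(A, y, rcond=None)[0]
    return m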
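To make clear what the function returns, here is a small check on the two ratings of user 12 from the sample above (3 and 3.5, taken in row order). The variable names `ratings_user_12` and `slope` are only for illustration.

```python
import numpy as np

def coef(y):
    # Slope of an OLS line fitted to y against its row index 0..s-1
    s = y.shape[0]
    A = np.vstack([np.arange(s), np.ones(s)]).T
    m, c = np.linalg.lstsq(A, y, rcond=None)[0]
    return m

# Ratings of user 12 from the sample data, in row order
ratings_user_12 = np.array([3.0, 3.5])
slope = coef(ratings_user_12)
print(slope)  # rating change per step for this user
```

With only two points the line passes through both, so the slope is just the difference between the two ratings.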
I was hoping to do something like the following with datatable (imported as dt):
mydt[:, coef(dt.f.rating), dt.by(dt.f.user_id)]
or otherwise run this function for each user id. Unfortunately, the data is too big for Pandas, so I would really appreciate hearing about any alternatives.
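To show the result I am after, here is a rough plain-NumPy sketch of the per-user computation. It is not a datatable solution. It assumes the `user_id` and `rating` columns fit in memory as NumPy arrays and that the rows are already in chronological order for each user, since `coef` uses row position as the x-axis. The helper name `slopes_by_user` is made up for this example.

```python
import numpy as np

def coef(y):
    # Slope of an OLS line fitted to y against its row index 0..s-1
    s = y.shape[0]
    A = np.vstack([np.arange(s), np.ones(s)]).T
    m, c = np.linalg.lstsq(A, y, rcond=None)[0]
    return m

def slopes_by_user(user_id, rating):
    # Hypothetical helper: group ratings by user and apply coef to each group.
    # A stable sort keeps the original row order within each user.
    order = np.argsort(user_id, kind="stable")
    uid_sorted = user_id[order]
    rating_sorted = rating[order]
    # First position of each distinct user in the sorted array
    users, starts = np.unique(uid_sorted, return_index=True)
    groups = np.split(rating_sorted, starts[1:])
    return {int(u): float(coef(g)) for u, g in zip(users, groups)}

# The sample rows from above
user_id = np.array([12, 13, 1, 12, 19])
rating = np.array([3, 4, 3, 3.5, 2])
slopes = slopes_by_user(user_id, rating)
print(slopes)
```

Users with a single rating get a slope of zero here, because `np.linalg.lstsq` returns the minimum-norm solution when the fit is underdetermined. The Python loop over groups is the slow part, which is why I am looking for something that does the grouping natively.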