Pandas gives different results for different machines

Question

After upgrading from numpy 1.9 to 1.10, I started to see the following regression model give different results on different machines that have the same hardware configuration:

import pandas as pd

# intercept=False fits the regression without a constant term
fitted_model = pd.ols(y=lhs_unpickled, x=rhs_unpickled, intercept=False)
print(fitted_model.beta)

lhs_unpickled and rhs_unpickled look like this:

> lhs_unpickled[1:5]
2008-04-24 00:18:00+00:00   -0.465517
2008-04-24 00:33:00+00:00   -0.519584
2008-04-24 00:48:00+00:00   -0.607410
2008-04-24 01:03:00+00:00   -0.705983
Freq: 15T, Name: AI_Index, dtype: float64

> rhs_unpickled[1:5]
                                CPM       XQH       FOD        EX
2008-04-24 00:18:00+00:00 -0.301556  0.148582  0.079320 -0.707586
2008-04-24 00:33:00+00:00 -0.274421  0.071747  0.130182 -0.659409
2008-04-24 00:48:00+00:00 -0.273960 -0.001447  0.148643 -0.703215
2008-04-24 01:03:00+00:00 -0.238426 -0.008732  0.130801 -0.698489

Is there something specific about pd.ols() that causes this inconsistent behavior with numpy 1.10?

Are you sure `y` and `x` should have the same value as you wrote (i.e. `lhs_unpickled`)? Because then `beta=1` no matter what — rafaelc, Jun 01 '18 at 21:57
Hm, so the typo was only here in this question? I think you probably have a typo in your code, or a difference in the data. The results should be the same. Also, note that `pd.ols` has been deprecated; I don't know whether there's some incompatibility there — rafaelc, Jun 01 '18 at 22:59
@RafaelC yes, the typo is only in this question. What should I use instead of pd.ols? Thank you. — Darth.Vader, Jun 01 '18 at 23:54
This is very broad. There is no reproducible example and many things are missing (OS, numpy setup and so on; probably not relevant, but even BLAS alone could induce nondeterminism). Additionally, it's unclear to me what exactly happens. Are those results deterministic or not? Did the statsmodels version change too (it should be the underlying code in use)? — sascha, Jun 02 '18 at 00:37
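
One way to gather the details asked for above is to run the same small script on each machine and compare its output. The sketch below uses only NumPy and assumes a reasonably recent version (the `rcond=None` argument is not accepted by very old releases). The helper names `environment_report` and `ols_no_intercept` and the synthetic data are hypothetical; swap in the real `rhs_unpickled.values` and `lhs_unpickled.values` to test the actual inputs.

```python
import sys
import numpy as np


def environment_report():
    """Collect version details that should match across machines."""
    return {"python": sys.version.split()[0], "numpy": np.__version__}


def ols_no_intercept(x, y):
    """Least-squares fit of y on x with no constant term.

    Returns the coefficients, the numerical rank of x, and the condition
    number of x. A large condition number means small floating-point
    differences can show up as visibly different coefficients.
    """
    beta, _, rank, _ = np.linalg.lstsq(x, y, rcond=None)
    return beta, rank, np.linalg.cond(x)


# Synthetic data, seeded so every machine starts from the same inputs.
rng = np.random.RandomState(0)
x = rng.standard_normal((100, 4))
true_beta = np.array([0.5, -1.0, 2.0, 0.25])  # hypothetical coefficients
y = x.dot(true_beta)

report = environment_report()
beta, rank, cond = ols_no_intercept(x, y)

print(report)
print("beta:", beta)
print("rank:", rank, "condition number:", cond)

# Prints how this NumPy build was configured, including BLAS/LAPACK details.
np.show_config()

# Running the fit twice on one machine checks run-to-run determinism.
beta_again, _, _ = ols_no_intercept(x, y)
print("identical on rerun:", np.array_equal(beta, beta_again))
```

If the versions and BLAS details match and the coefficients still differ by more than rounding error, the input data on the two machines is the next thing to compare.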
