
Is there any reason why the glmer function from lme4 would produce different results on different machines? The hardware in the machines is substantially different, though all are running the same OS, R, and package versions (it turns out this is not actually true).

The formula has a grouped binomial response variable and 22 continuous fixed effects, all on the same scale, plus several random effects whose grouping factors are stored as strings. I am using the logit link function.

cbind(ill, not_ill) ~ 0 + fix1 + fix2 + ... + fix22 + (1|id/region/country) +
(1|season)
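For reference, a minimal sketch of the kind of call described above, run on simulated placeholder data (the variable names, group levels, and binomial size are assumptions standing in for the confidential data, and only two of the 22 fixed effects are shown). Setting the seed before any stochastic step is the first thing to match across machines:

```r
library(lme4)

set.seed(42)  # fix the RNG so any stochastic steps match across machines

# Hypothetical stand-in data mimicking the structure described above
n <- 500
d <- data.frame(
  fix1    = rnorm(n),
  fix2    = rnorm(n),
  id      = sample(letters[1:20], n, replace = TRUE),
  region  = sample(c("north", "south"), n, replace = TRUE),
  country = sample(c("UK", "FR"), n, replace = TRUE),
  season  = sample(c("winter", "spring", "summer", "autumn"), n, replace = TRUE)
)
d$ill     <- rbinom(n, 10, 0.3)  # grouped binomial response: successes out of 10
d$not_ill <- 10 - d$ill

m <- glmer(cbind(ill, not_ill) ~ 0 + fix1 + fix2 +
             (1 | id/region/country) + (1 | season),
           data = d, family = binomial(link = "logit"))
```

On arbitrary simulated data a model like this may itself emit convergence warnings; the point of the sketch is only the structure of the call, not a clean fit.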

When using train and test data sets for leave-one-out cross-validation, I get very similar results across machines. However, on one machine I consistently get clean output with no warnings; on another I get convergence warnings on every fold.

N.B. The train/test sets are identical across machines

EDIT: adding sessionInfo()

Machine 1 (this is the one that produces nice results)

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] blmeco_1.1     arm_1.9-1      MASS_7.3-45    lme4_1.1-12    Matrix_1.2-7.1

loaded via a namespace (and not attached):
 [1] minqa_1.2.4     coda_0.18-1     abind_1.4-5     Rcpp_0.12.7
 [5] MuMIn_1.15.6    splines_3.3.1   nlme_3.1-128    grid_3.3.1
 [9] nloptr_1.0.4    stats4_3.3.1    lattice_0.20-34

Machine 2 (not-so-nice results)

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] blmeco_1.1   arm_1.9-1    MASS_7.3-45  lme4_1.1-12  Matrix_1.2-3

loaded via a namespace (and not attached):
 [1] minqa_1.2.4     coda_0.18-1     abind_1.4-5     Rcpp_0.12.7
 [5] MuMIn_1.15.6    splines_3.2.3   nlme_3.1-124    grid_3.2.3
 [9] nloptr_1.0.4    stats4_3.2.3    lattice_0.20-33

Obviously there are a few differences here that I missed, so I will rectify that and see if the output changes. Of the differences that exist, Matrix is the most likely to be causing an issue, as it is a dependency of lme4. Thanks for the comments that led me here.

JakeCowton
  • Please provide a reproducible example http://stackoverflow.com/help/mcve Having said that, you set the same seed value on both computers, right? – Hack-R Oct 30 '16 at 00:28
  • This is the most information I can provide due to confidentiality reasons. I am not looking for solutions to my problem, I am only trying to understand/find out if there are any non-deterministic components to what I have described that I am not seeing. I don't think this really warrants the down vote. – JakeCowton Oct 30 '16 at 00:34
  • 2
    That's really not how it works. Everyone who has a job has confidential data which is 99.9% of people here. The burden is on you to take the time and effort to make a **reproducible example** using public, self-created, or blinded data, such that it reflects your problem without needing to be your actual data. I'm voting to close this question as off-topic for not containing a reproducible example. – Hack-R Oct 30 '16 at 01:10
  • @RockJake28: it would be helpful if you could edit your question to add the results of `sessionInfo()` from each of the computers you are using. Do you get differing results when fitting the raw models, or just when using cross-validation? Have you used `set.seed` as Hack-R suggested? – user20650 Oct 30 '16 at 01:34
  • 1
    @Hack-R I appreciate what you're saying, but I'm not looking for someone to solve my problem, which would of course require a reproducible example. My question is "Is there any reason why the glmer function from lme4 would produce different results on different machines?", not how to fix it or anything like that. I'm looking to find out if there are, for example, parts of `lme4` which use values that would be affected by different hardware. @user20650 A full run takes a couple of hours; I've added the `set.seed` value and will report back with the results, and I will update with `sessionInfo()`. – JakeCowton Oct 30 '16 at 02:02

1 Answer


I'm not sure what you mean by "non-deterministic" here; I would usually take that to mean that successive runs of the same code, on the same machine, could give different results.

For large, unstable problems it would be mildly surprising, but not impossible, to get different results on different hardware platforms under the same operating system. We certainly see cases where the same version of the package (same R and C++ code) gives different results when compiled with different compilers under different operating systems. If those differences are on either side of a tolerance test, then you will get warnings in one case and not in the other. I would be more concerned by how far apart the estimates are on different platforms than in whether you get warnings or not.

It would certainly narrow things down to make sure you were doing everything as similarly as possible (e.g. you are still using different versions of R and, as you pointed out, different versions of Matrix, on the different machines ...)
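One way to probe whether the warnings are just a tolerance issue, sketched here under the assumption that `m` is a fitted model of the form given in the question: loosen the gradient-convergence check via `glmerControl`, or (with recent lme4 versions, which export `allFit`) refit with every available optimizer and compare the estimates.

```r
library(lme4)

# Relax the gradient-convergence check (the default tolerance is 2e-3);
# if the warning disappears, the fits were sitting near the tolerance boundary
ctrl <- glmerControl(check.conv.grad = .makeCC("warning", tol = 5e-3))

# m_refit <- update(m, control = ctrl)   # 'm' is the fitted model from the question

# allFit() refits the same model with all available optimizers; if the
# fixed-effect estimates agree to several decimal places across optimizers,
# the warnings are likely tolerance noise rather than a real fitting failure
# fits <- allFit(m)
# summary(fits)$fixef
```

As the answer notes, what matters is how far apart the estimates are, not whether a particular build happens to trip the warning threshold.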

Ben Bolker
  • "the same version of the package (same R and C++ code) gives different results when compiled with different compilers" this is what I was looking for. Thanks. – JakeCowton Oct 30 '16 at 14:16