Same seed, PL/R vs. R, different results (Random Forest)

Question

I have an R function that takes input data containing missing values, imputes those values with Random Forest imputation (the rfImpute function from the randomForest package), and then runs a Random Forest importance calculation to identify the relative importance of the variables (ranger from the ranger package). The function uses the seed 2018.

When I run the function in R with set.seed(2018), I get one set of results. When I run the exact same function on the exact same input data with the exact same seed in PL/R (using Navicat), the results are different.

I am having a really hard time understanding what could be causing this, as everything is exactly the same between the two (except that one is R and the other is PL/R). For some input datasets the results are equivalent, but for others they are not. What could the problem be?

Note: I am not able to provide a simple example since my data is confidential.

Do the results within R and within PL/R stay the same with each run? We had RF results that changed with each run despite setting the same seed every time. It turned out that the order of the input data (which came from an external source) varied. The split into train and test data was based on a random selection of indices from 1 to NROW(data)... So maybe you are having something similar? — TinglTanglBob, Sep 18 '18 at 14:53
@TinglTanglBob the results stay the same with each run... I am ordering the data once it gets inputted, so I don't think that's the problem? — Grint, Sep 18 '18 at 15:01
Do you get the same random numbers when you call `set.seed(2018); runif(10)` within PL/R and local R? — Ralf Stubner, Sep 18 '18 at 16:05
When retrieving the data set, is the order of the records fixed (and the same)? — joop, Sep 18 '18 at 17:20
@RalfStubner yep, I get the same numbers... so confused by this — Grint, Sep 19 '18 at 15:21
@joop I am ordering the data inside the R Script, so not sure how that could be the cause? — Grint, Sep 19 '18 at 15:21
In that case you will have to provide a minimal example. Instead of your confidential data you can produce a similar random dataset. — Ralf Stubner, Sep 19 '18 at 18:51
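
Following the suggestions in the comments above, here is a minimal sketch of what such a reproducible test could look like. Everything in it is an assumption for illustration: the synthetic data frame, its column names (y, x1, x2, x3), and the helper names make_fake_data, env_report, data_fingerprint and run_pipeline are hypothetical and not taken from the asker's function. The idea is to run the same script in plain R and inside PL/R and compare the printed output: the RNG settings, the collation locale (which can affect how character columns are ordered), the package versions, and a checksum of the data as R sees it.

```r
# Hypothetical stand-in for the confidential data: a numeric response plus
# three predictors, with 20 values knocked out of x1 and of x2.
make_fake_data <- function(n = 200, seed = 2018) {
  set.seed(seed)
  d <- data.frame(
    y  = rnorm(n),
    x1 = rnorm(n),
    x2 = runif(n),
    x3 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
  )
  d$x1[sample(n, 20)] <- NA
  d$x2[sample(n, 20)] <- NA
  d
}

# Settings that may differ between a desktop R session and the R embedded
# in PostgreSQL, even when set.seed() is called with the same value.
env_report <- function(pkgs = c("randomForest", "ranger")) {
  versions <- vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_  # package not installed in this environment
    }
  }, character(1))
  list(
    r_version = R.version.string,
    rng_kind  = RNGkind(),                    # generator and sampling method
    collate   = Sys.getlocale("LC_COLLATE"),  # affects sorting of strings
    packages  = versions
  )
}

# Rough checksum of the data after any ordering step: writes the data frame
# to a temporary CSV and hashes the file. Different row order or different
# values should give a different md5.
data_fingerprint <- function(d) {
  f <- tempfile(fileext = ".csv")
  on.exit(unlink(f))
  write.csv(d, f, row.names = FALSE)
  list(
    md5     = unname(tools::md5sum(f)),
    classes = vapply(d, function(col) class(col)[1], character(1))
  )
}

# Assumed shape of the pipeline described in the question: impute, then
# compute importance. num.threads = 1 takes threading out of the picture.
run_pipeline <- function(d, seed = 2018) {
  set.seed(seed)
  imputed <- randomForest::rfImpute(y ~ ., data = d)
  names(imputed)[1] <- "y"  # rfImpute returns the response as first column
  fit <- ranger::ranger(y ~ ., data = imputed, importance = "permutation",
                        seed = seed, num.threads = 1)
  fit$variable.importance
}

d <- make_fake_data()
str(env_report())
str(data_fingerprint(d))

# Only run the forest part where both packages are available.
if (requireNamespace("randomForest", quietly = TRUE) &&
    requireNamespace("ranger", quietly = TRUE)) {
  print(run_pipeline(d))
}
```

If the two environments print different values for any of these items, that would narrow down where to look. If everything matches on the synthetic data but the real data still diverges, calling data_fingerprint on the real data right after the ordering step in both environments would show whether the two sides are actually seeing identical input.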
