3

I'm trying to build a deterministic decision tree in Python, but I've run into an issue.

If I run my script twice, I get two different decision trees, because of the algorithm's random_state.

I tried fixing the seed (random_state=0), but it's still not working.

I'd like to remove the randomness from my decision tree, but I can't find a clear solution.

Rahul Agarwal
Corentin Moreau

2 Answers

3

scikit-learn falls back to NumPy's global random state whenever no random_state is passed to an estimator, so you can set the global seed at the start of your script with:

import numpy as np

np.random.seed(0)
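
To check that this works for a tree in particular, here is a minimal sketch. The dataset from make_classification, the max_features=3 setting, and the helper name fit_tree_text are assumptions made for illustration, not part of the question. The tree is created without random_state, so it draws from NumPy's global state, which is re-seeded before each fit.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data, generated once with a fixed seed (assumption for the demo).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)


def fit_tree_text():
    """Seed NumPy's global state, fit a tree, and return its text dump."""
    np.random.seed(0)
    # No random_state here: the estimator uses NumPy's global random state.
    # max_features=3 makes the split search sample features randomly.
    tree = DecisionTreeClassifier(max_features=3)
    tree.fit(X, y)
    return export_text(tree)


first = fit_tree_text()
second = fit_tree_text()
print(first == second)
```

Note that the global seed only helps if it is set before every fit you want to reproduce, and any other code that draws from np.random in between will change what the tree sees.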
Stradivari
3

The random_state argument should work, but here are two options:

Option 1:

from sklearn.ensemble import RandomForestRegressor

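# Note: criterion='mse' was renamed to 'squared_error' in scikit-learn 1.0
# and removed in 1.2; use 'mse' only on older versions.
rf = RandomForestRegressor(n_estimators=1000, criterion='squared_error', min_samples_leaf=4,
                           random_state=0)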

This should return the same results every single time.
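
The same idea applied to a single decision tree, which is what the question asks about, might look like the sketch below. The synthetic data from make_regression and the variable names are assumptions for the demo. It fits two trees with the same random_state on the same data and compares both their predictions and their internal split structure.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data with a fixed seed, so the input itself is reproducible.
X, y = make_regression(n_samples=200, n_features=6, noise=0.5, random_state=0)

# Two independent estimators with the same seed, fitted on the same data.
tree_a = DecisionTreeRegressor(min_samples_leaf=4, random_state=0).fit(X, y)
tree_b = DecisionTreeRegressor(min_samples_leaf=4, random_state=0).fit(X, y)

# Compare predictions and the fitted structure (split features and thresholds).
same_predictions = np.array_equal(tree_a.predict(X), tree_b.predict(X))
same_structure = (
    np.array_equal(tree_a.tree_.feature, tree_b.tree_.feature)
    and np.array_equal(tree_a.tree_.threshold, tree_b.tree_.threshold)
)
print(same_predictions, same_structure)
```

If the trees still differ between runs with random_state fixed, it is worth checking whether the training data itself changes, for example through an unseeded shuffle or train/test split earlier in the script.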


Scikit-learn does not use its own global random state; whenever a RandomState instance or an integer random seed is not provided as an argument, it relies on the numpy global random state, which can be set using numpy.random.seed


Option 2:

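That being said, calling np.random.seed(0) before creating and fitting the RandomForestRegressor should also do the trick. What matters is that the seed is set before the model is fitted, not where the import sits.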

import numpy as np
np.random.seed(0)

Source: http://scikit-learn.org/stable/faq.html#how-do-i-set-a-random-state-for-an-entire-execution

seralouk