First, a couple of remarks:
- the name of the algorithm is Gradient Boosting (Regression Trees or Machines) and it is not directly related to Stochastic Gradient Descent
- you should never evaluate the accuracy of a machine learning algorithm on your training data, otherwise you won't be able to detect over-fitting of the model. Use `sklearn.cross_validation.train_test_split` to split `X` and `y` into `X_train`, `y_train` for fitting and `X_test`, `y_test` for scoring instead (see the sketch after this list).
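For example, a minimal split sketch (note that in recent scikit-learn releases `train_test_split` has moved to `sklearn.model_selection`; the synthetic data is just a stand-in for your own `X` and `y`):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in older releases

# Synthetic data standing in for your X and y
X, y = make_regression(n_samples=1000, n_features=20, n_informative=3, random_state=0)

# Hold out 25% of the samples for scoring, so over-fitting becomes visible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
```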
Now to answer your question: GBRT models are indeed non-deterministic. To get deterministic / reproducible runs, pass `random_state=0` to seed the pseudo-random number generator (or alternatively pass `max_features=None`, but this is not recommended).
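Continuing from the split above, a small sketch of a seeded fit (`GradientBoostingRegressor` is used here as a stand-in for your model):

```python
from sklearn.ensemble import GradientBoostingRegressor

# random_state=0 seeds the PRNG: refitting on the same data yields the same model
model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # the held-out R^2 is now identical on every run
```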
The fact that you observe such big variations in your training error is weird though. Maybe your output signal is very correlated with a small number of informative features and most of the other features are just noise?
You could try to fit a `RandomForestClassifier` model to your data and use the computed `feature_importances_` array to discard noisy features and help stabilize your GBRT models.
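A rough sketch of that workflow (the synthetic data and the 0.02 importance cutoff are only illustrative assumptions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: only 3 of the 20 features carry signal, the rest are noise
X, y = make_classification(n_samples=1000, n_features=20, n_informative=3,
                           n_redundant=0, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Keep only the features whose importance exceeds an (arbitrary) cutoff
mask = forest.feature_importances_ > 0.02
X_reduced = X[:, mask]
print(np.flatnonzero(mask))  # indices of the retained features
```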