I want to draw partial dependence plots of some of the input variables against the target value. Using sklearn, I trained a gradient boosting model and then ran sklearn.inspection.plot_partial_dependence on the fitted model. However, I get this error: ValueError: percentiles are too close to each other, unable to build the grid. Please choose percentiles that are further apart. Any idea how to fix that?

Here is my code:

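    import matplotlib.pyplot as plt
    # Needed on scikit-learn < 1.0, where HistGradientBoostingRegressor is experimental
    from sklearn.experimental import enable_hist_gradient_boosting  # noqa
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.inspection import plot_partial_dependence

    # Feature names; inputsTrain must be a DataFrame for string names to work
    columns = ['zip', 't-s', 'r', 'f', 'm-t', 'ir', 'if', 'n-d-m', 't-n-d', 'a-d-f-l-d-t', 'a-d-f-l-a-t', 'a']

    print("Training HistGradientBoostingRegressor...")
    est = HistGradientBoostingRegressor()
    est.fit(inputsTrain, outputsTrain)
    print("Test R2 score: {:.2f}".format(est.score(inputsTest, outputsTest)))

    print('Computing partial dependence plots...')
    # One-way plots for every column, plus a two-way plot for ('zip', 'r')
    features = columns + [('zip', 'r')]
    plot_partial_dependence(est, inputsTrain, features,
                            n_jobs=3, grid_resolution=20)
    fig = plt.gcf()
    fig.subplots_adjust(wspace=0.4, hspace=0.3)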

I got the following error:

joblib.externals.loky.process_executor._RemoteTraceback: 
Traceback (most recent call last):
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 608, in __call__
    return self.func(*args, **kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 256, in __call__
    for func, args, kwargs in self.items]
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 256, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/inspection/_partial_dependence.py", line 404, in partial_dependence
    grid_resolution
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/inspection/_partial_dependence.py", line 94, in _grid_from_X
    'percentiles are too close to each other, '
ValueError: percentiles are too close to each other, unable to build the grid. Please choose percentiles that are further apart.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 2141, in <module>
    main()
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 2132, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 1441, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/anac/pycharm-2020.2/plugins/python/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "home/anac/pycharm-2020.2/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/anac/project/train_env.py", line 514, in <module>
    gradient_boosting_pdp(inputsTrain, outputsTrain, inputsValid, outputsValid, inputsTest, outputsTest)
  File "home/anac/project/utilities.py", line 271, in gradient_boosting_pdp
    n_jobs=3, grid_resolution=20)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/sklearn/inspection/_plot/partial_dependence.py", line 286, in plot_partial_dependence
    for fxs in features)
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 1017, in __call__
    self.retrieve()
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 909, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/anac/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 562, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/anac/anaconda3/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/home/anac/anaconda3/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
ValueError: percentiles are too close to each other, unable to build the grid. Please choose percentiles that are further apart.
Afshin Oroojlooy

1 Answer

It's late but I'll answer it anyway.

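You need to pass a percentiles parameter to plot_partial_dependence (or partial_dependence). It sets the lower and upper percentiles used as the extreme values of the grid for each feature, and the default is (0.05, 0.95). This exception is raised when, for some feature, the values at those two percentiles are (almost) equal, so no grid can be built between them. In that case, replace the default with (a, b) such that a and b are further apart than 0.05 and 0.95, i.e. a < 0.05 and b > 0.95. The furthest you can go is (0, 1).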
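
To find out which features trigger the error before plotting, something like the following might help. It is only a rough sketch: the helper name features_with_collapsed_percentiles and the toy data are made up for illustration, and it assumes the error comes from a feature whose values at the two percentiles coincide (columns with fewer distinct values than grid_resolution are assumed to be gridded on their unique values and are skipped). It uses np.quantile, which may differ slightly from the quantile routine sklearn uses internally.

```python
import numpy as np


def features_with_collapsed_percentiles(X, names, percentiles=(0.05, 0.95),
                                        grid_resolution=20):
    """Return the names of columns whose lower and upper percentile values coincide.

    Hypothetical helper, not part of scikit-learn.
    """
    X = np.asarray(X, dtype=float)
    flagged = []
    for j, name in enumerate(names):
        col = X[:, j]
        # Assumption: low-cardinality columns are gridded on their unique
        # values, so the percentile check does not apply to them.
        if np.unique(col).shape[0] < grid_resolution:
            continue
        lo, hi = np.quantile(col, percentiles)
        if np.allclose(lo, hi):
            flagged.append(name)
    return flagged


# Toy data (made up): one well-spread column and one that is zero almost everywhere.
spread = np.linspace(0.0, 1.0, 1000)
spiky = np.zeros(1000)
spiky[:30] = np.arange(1, 31)  # 30 distinct non-zero values among 970 zeros
X = np.column_stack([spread, spiky])

print(features_with_collapsed_percentiles(X, ["spread", "spiky"]))
print(features_with_collapsed_percentiles(X, ["spread", "spiky"], percentiles=(0, 1)))
```

With the default percentiles only the "spiky" column is flagged, because its 5th and 95th percentile values are both 0; with percentiles=(0, 1) nothing is flagged. A column that is truly constant over many rows cannot be fixed this way, but such a column has a single unique value and would be skipped by the low-cardinality rule anyway.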

In your case, you can write for example:

    plot_partial_dependence(est, inputsTrain, features,
                            n_jobs=3, grid_resolution=20, percentiles=(0, 1))
yakir0