Partial Dependencies

Hint

Conveys a broad overview of the search space and of what has been explored during the experiment. Helps identify the optimal regions of the space.

The partial dependency computes the average predicted performance with respect to a set of hyperparameters, marginalizing out the other hyperparameters.

To predict the performance on an unseen set of hyperparameters, we train a regression model on the available trial history. We build a grid of g points for a hyperparameter of interest, or a 2-D grid of g ^ 2 points for a pair of hyperparameters. We sample a group of n sets of hyperparameters from the entire space to marginalize over the other hyperparameters. For each value of the grid, we compute the prediction of the regression model on all points of the group, with the hyperparameter of interest set to that grid value. For instance, for a 1-D grid of g points and a group of n points, we compute g * n predictions.
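The procedure above can be sketched in a few lines of plain Python. This is only an illustration, not Orion's implementation: the function name `partial_dependency_1d`, the `(low, high)` bounds format, and the uniform sampling are assumptions, and a hand-written `toy_predict` function stands in for the regression model trained on the trial history.

```python
import random
import statistics


def partial_dependency_1d(predict, space, name, n_grid_points=10, n_samples=50, seed=0):
    """Average prediction along `name`, marginalizing over the other hyperparameters.

    predict: callable taking a dict of hyperparameters and returning a float
             (hypothetical stand-in for the fitted regression model).
    space:   dict mapping each hyperparameter name to (low, high) bounds
             (assumed format, for illustration only).
    """
    rng = random.Random(seed)
    low, high = space[name]
    # Grid of g evenly spaced values for the hyperparameter of interest.
    step = (high - low) / (n_grid_points - 1)
    grid = [low + i * step for i in range(n_grid_points)]
    # Group of n points sampled from the entire space (uniformly, as an assumption).
    samples = [
        {key: rng.uniform(*bounds) for key, bounds in space.items()}
        for _ in range(n_samples)
    ]
    averages, stds = [], []
    for value in grid:
        # Overwrite the hyperparameter of interest, keep the sampled values for the rest.
        predictions = [predict({**point, name: value}) for point in samples]
        averages.append(statistics.fmean(predictions))
        stds.append(statistics.pstdev(predictions))
    return grid, averages, stds


def toy_predict(params):
    # Toy objective standing in for a trained regressor.
    return (params["learning_rate"] - 0.1) ** 2 + params["dropout"]


space = {"learning_rate": (0.0, 1.0), "dropout": (0.0, 0.5)}
grid, averages, stds = partial_dependency_1d(toy_predict, space, "learning_rate")
```

With the default arguments, this sketch calls `toy_predict` g * n = 10 * 50 = 500 times. The per-grid-value standard deviation is returned as well, since it is the quantity a shaded band around the average would represent.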

For a search space of d hyperparameters, the partial dependency plot is organized as a matrix of (d, d) subplots. The subplots on the diagonal show the partial dependency of each hyperparameter separately, while the subplots below the diagonal show the partial dependency between two hyperparameters. Let’s look at a simple example to make it more concrete.

orion.plotting.base.partial_dependencies(experiment, with_evc_tree=True, params=None, smoothing=0.85, verbose_hover=True, n_grid_points=10, n_samples=50, colorscale='Blues', model='RandomForestRegressor', model_kwargs=None)

Make contour plots to visualize the search space of each combination of params.

Parameters
experiment: ExperimentClient or Experiment

The orion object containing the experiment data

with_evc_tree: bool, optional

Fetch all trials from the EVC tree. Default: True

params: list of str, optional

Indicates the parameters to include in the plots. All parameters are included by default.

smoothing: float, optional

Smoothing applied to the contour plot. 0 corresponds to no smoothing. Default is 0.85.

verbose_hover: bool

Indicates whether to display the hyperparameter in hover tooltips. True by default.

colorscale: str, optional

The colorscale used for the contour plots. Supported values depend on the backend. Default is ‘Blues’.

n_grid_points: int, optional

Number of points in the grid to compute partial dependency. Default is 10.

n_samples: int, optional

Number of points randomly sampled from the search space to marginalize over the other hyperparameters when computing the partial dependency. Default is 50.

model: str

Name of the regression model to use. Can be one of:

- AdaBoostRegressor
- BaggingRegressor
- ExtraTreesRegressor
- GradientBoostingRegressor
- RandomForestRegressor (default)

model_kwargs: dict, optional

Arguments for the regressor model.

Returns
plotly.graph_objects.Figure
Raises
ValueError

If no experiment is provided.

The partial dependencies plot can be executed directly from the experiment with plot.partial_dependencies() as shown in the example below.

from orion.client import get_experiment

# Specify the database where the experiments are stored. We use a local PickleDB here.
storage = dict(type="legacy", database=dict(type="pickleddb", host="../db.pkl"))

# Load the data for the specified experiment
experiment = get_experiment("2-dim-exp", storage=storage)
fig = experiment.plot.partial_dependencies()
fig


For the plots on the diagonal, the y-axis is the objective and the x-axis is the value of the corresponding hyperparameter. For the contour plots below the diagonal, the y-axis and x-axis are the values of the corresponding hyperparameters labelled on the left and at the bottom. The objective is represented as a color gradient in the contour plots. The light blue area in the plots on the diagonal represents the standard deviation of the predicted objective when varying the other hyperparameters over the search space. The black dots represent the trials in the current history of the experiment. If you hover your cursor over a dot, you will see the configuration of the corresponding trial in the following format:

ID: <trial id>
value: <objective>
time: <completed time>
parameters
  <name>: <value>

Even for a simple 2-d search space, the partial dependency is very useful. In this example we clearly see the optimal regions for both hyperparameters. We can also see that the optimal region for the learning rate is larger when the dropout is low, and narrower when the dropout approaches 0.5.

Todo

Make one toy example where two HPs are dependent.

Options

Params

The simple example involved only 2 hyperparameters, but typical search spaces can be much larger. The partial dependency plot becomes hard to read with more than 3-5 hyperparameters, depending on the size of your screen. With a fixed width like in this documentation, 5 hyperparameters are impossible to read, as you can see below. (Data coming from the tutorial Checkpointing trials.)

experiment = get_experiment("hyperband-cifar10", storage=storage)
experiment.plot.partial_dependencies()


You can select the hyperparameters to show with the argument params.

experiment.plot.partial_dependencies(params=["gamma", "learning_rate"])
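To see why restricting `params` helps, the following sketch enumerates the subplots of the (d, d) matrix described earlier: one 1-D plot per hyperparameter on the diagonal and one contour plot per pair below it. The helper `subplot_layout` and the placeholder names `hp0`, `hp1`, ... are hypothetical and only illustrate the layout; they are not part of Orion.

```python
def subplot_layout(params):
    """Map (row, column) positions to the hyperparameters each subplot shows."""
    layout = {}
    for row, y_name in enumerate(params):
        for col, x_name in enumerate(params):
            if row == col:
                # Diagonal: partial dependency of a single hyperparameter.
                layout[(row, col)] = (x_name,)
            elif row > col:
                # Below the diagonal: contour plot, x from the column, y from the row.
                layout[(row, col)] = (x_name, y_name)
            # Positions above the diagonal are left empty.
    return layout


for n_params in (2, 3, 5):
    names = [f"hp{i}" for i in range(n_params)]
    print(n_params, "hyperparameters ->", len(subplot_layout(names)), "subplots")
```

The number of subplots is d * (d + 1) / 2, so 5 hyperparameters already yield 15 subplots sharing the same figure width, while selecting 2 of them brings it down to 3.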


Grid resolution

The grid used for the partial dependency can be more or less coarse. Coarser grids will be faster to compute.

import time

experiment = get_experiment("2-dim-exp", storage=storage)
# time.clock() was removed in Python 3.8; time.perf_counter() is its replacement.
start = time.perf_counter()
fig = experiment.plot.partial_dependencies(n_grid_points=5)
print(time.perf_counter() - start, "seconds to compute")
fig

Out:

0.5145319999999991 seconds to compute
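A rough way to reason about this cost is to count the model predictions required for a full plot. The helper below is a back-of-the-envelope sketch, not Orion code: the name `prediction_count` is hypothetical, and it assumes that each 2-D grid of g ^ 2 points is evaluated on all n sampled points in the same way as the 1-D case.

```python
def prediction_count(n_params, n_grid_points=10, n_samples=50):
    """Estimate how many model predictions a full partial dependency plot needs."""
    # One 1-D grid of g points per hyperparameter, evaluated on n samples each.
    one_d = n_params * n_grid_points * n_samples
    # One 2-D grid of g ^ 2 points per pair of hyperparameters (assumed g^2 * n each).
    n_pairs = n_params * (n_params - 1) // 2
    two_d = n_pairs * n_grid_points ** 2 * n_samples
    return one_d + two_d


print(prediction_count(2, n_grid_points=10))  # 6000
print(prediction_count(2, n_grid_points=5))   # 1750
```

Under these assumptions the 2-D grids dominate, so halving `n_grid_points` cuts the number of predictions by more than half. Actual timings also include fitting the regression model and rendering the figure.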