Analysis

Modules

Provides HPO analysis tools

orion.analysis.average(trials, group_by='order', key='best', return_var=False)

Compute the average of some trial attribute.

By default it will compute the average objective at each time step across multiple experiments.

Parameters

trials: DataFrame: A dataframe of trials containing, at least, the columns ‘best’ and ‘order’.
group_by: str, optional: The attribute used to group trials for the average. By default, trials are grouped by order (e.g., all first trials across experiments).
key: str, optional: A single attribute, or several attributes separated by ‘,’, to average. Defaults to ‘best’ as returned by orion.analysis.regret.
return_var: bool, optional: If True, add a column ‘{key}_var’, where ‘{key}’ is the value of the argument key. Defaults to False.

Returns

A dataframe with columns ‘order’, ‘{key}_mean’ and, if return_var is True, ‘{key}_var’.
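The grouping and averaging described above can be illustrated without the library. The following is a minimal sketch in plain Python, not orion's implementation: the helper name `average_by_group`, the list-of-dicts input, the use of the sample variance, and the example values are all assumptions made for illustration. It handles a single key only.

```python
from collections import defaultdict
from statistics import fmean, variance


def average_by_group(trials, group_by="order", key="best", return_var=False):
    """Average `key` over all rows that share the same `group_by` value."""
    groups = defaultdict(list)
    for row in trials:
        groups[row[group_by]].append(row[key])

    summary = []
    for group, values in sorted(groups.items()):
        entry = {group_by: group, f"{key}_mean": fmean(values)}
        if return_var:
            # Sample variance is an assumption of this sketch; it needs
            # at least two values, so single-row groups get 0.0.
            entry[f"{key}_var"] = variance(values) if len(values) > 1 else 0.0
        summary.append(entry)
    return summary


# Two hypothetical experiments ("a" and "b") with three trials each.
trials = [
    {"experiment": "a", "order": 0, "best": 4.0},
    {"experiment": "a", "order": 1, "best": 2.0},
    {"experiment": "a", "order": 2, "best": 2.0},
    {"experiment": "b", "order": 0, "best": 6.0},
    {"experiment": "b", "order": 1, "best": 4.0},
    {"experiment": "b", "order": 2, "best": 1.0},
]
summary = average_by_group(trials, return_var=True)
```

Each entry of `summary` corresponds to one time step, averaged across the two experiments.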

orion.analysis.lpi(trials, space, mode='best', model='RandomForestRegressor', n_points=20, n_runs=10, **kwargs)

Calculates the Local Parameter Importance for a collection of orion.core.worker.trial.Trial.

For more information on the metric, see the original paper at https://ml.informatik.uni-freiburg.de/papers/18-LION12-CAVE.pdf.

Biedenkapp, André, et al. “Cave: Configuration assessment, visualization and evaluation.” International Conference on Learning and Intelligent Optimization. Springer, Cham, 2018.

Parameters

trials: DataFrame or dict: A dataframe of trials containing, at least, the columns ‘objective’ and ‘id’. Or a dict equivalent.
space: Space object: A space object from an experiment.
mode: str: Mode to compute the LPI.
- best: Take the best trial found as the anchor for the LPI.
- linear: Recompute the LPI for all values on a grid.
model: str: Name of the regression model to use. Can be one of
- AdaBoostRegressor
- BaggingRegressor
- ExtraTreesRegressor
- GradientBoostingRegressor
- RandomForestRegressor (Default)
n_points: int: Number of points to compute the variances. Default is 20.
n_runs: int: Number of runs to compute the standard error of the LPI. Default is 10.
**kwargs: Arguments for the regressor model.

Returns

DataFrame: LPI value for each parameter. If mode is linear, the DataFrame instead lists parameter values along with the corresponding LPI metrics.
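To give an intuition for the metric in ‘best’ mode, the sketch below follows the definition in the cited paper: vary one parameter at a time around an anchor configuration, measure the variance of the predicted objective, and normalize by the sum of the variances over all parameters. This is a simplified illustration, not orion's implementation. The `surrogate` function stands in for a fitted regression model, and the helper name, bounds, and anchor values are hypothetical. Repeated runs (n_runs) and the standard error are left out.

```python
from statistics import pvariance


def surrogate(point):
    """Hypothetical stand-in for a regressor fitted on the trials."""
    return (point["lr"] - 0.1) ** 2 + 0.5 * point["momentum"]


def local_parameter_importance(predict, bounds, anchor, n_points=20):
    """Fraction of the local variance attributable to each parameter."""
    variances = {}
    for name, (low, high) in bounds.items():
        step = (high - low) / (n_points - 1)
        # Move only `name` along a grid; keep the other parameters at the anchor.
        predictions = [
            predict({**anchor, name: low + i * step}) for i in range(n_points)
        ]
        variances[name] = pvariance(predictions)

    total = sum(variances.values())
    return {
        name: (value / total if total > 0 else 0.0)
        for name, value in variances.items()
    }


bounds = {"lr": (0.0, 0.2), "momentum": (0.0, 1.0)}
anchor = {"lr": 0.1, "momentum": 0.0}  # assumed best trial found so far
importance = local_parameter_importance(surrogate, bounds, anchor)
```

With this toy surrogate, `momentum` receives almost all of the importance because the objective barely changes with `lr` over its range.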

orion.analysis.partial_dependency(trials, space, params=None, model='RandomForestRegressor', n_grid_points=10, n_samples=50, **kwargs)

Calculates the partial dependency of parameters in a collection of orion.core.worker.trial.Trial.

Parameters

trials: DataFrame or dict: A dataframe of trials containing, at least, the columns ‘objective’ and ‘id’. Or a dict equivalent.
space: Space object: A space object from an experiment.
params: list of str, optional: The parameters to include in the computation. All parameters are included by default.
model: str: Name of the regression model to use. Can be one of
- AdaBoostRegressor
- BaggingRegressor
- ExtraTreesRegressor
- GradientBoostingRegressor
- RandomForestRegressor (Default)
n_grid_points: int: Number of points in the grid to compute partial dependency. Default is 10.
n_samples: int: Number of samples to randomly generate the grid used to compute the partial dependency. Default is 50.
**kwargs: Arguments for the regressor model.

Returns

dict: Dictionary of DataFrames. Keys are either a pair of parameter names (dim1.name, dim2.name), one for each combination of parameters, or a single parameter name (dim1.name), one for each parameter individually. Columns are (dim1.name, dim2.name, objective) for pairs and (dim1.name, objective) for single parameters.
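The single-parameter case can be sketched as follows: draw random samples from the space, then for each grid value of the parameter of interest, overwrite that parameter in every sample and average the model's predictions. This is a minimal illustration of partial dependence in general, not orion's implementation. The `surrogate` function stands in for a fitted regression model, the space is reduced to hypothetical uniform bounds, and the helper name and `seed` argument are assumptions.

```python
import random


def surrogate(point):
    """Hypothetical stand-in for a regressor fitted on the trials."""
    return (point["lr"] - 0.1) ** 2 + 0.5 * point["momentum"]


def partial_dependency_1d(predict, bounds, name, n_grid_points=10,
                          n_samples=50, seed=0):
    """Average prediction as a function of one parameter (n_grid_points >= 2)."""
    rng = random.Random(seed)
    # Random samples over the whole space, reused for every grid value.
    samples = [
        {param: rng.uniform(low, high) for param, (low, high) in bounds.items()}
        for _ in range(n_samples)
    ]

    low, high = bounds[name]
    step = (high - low) / (n_grid_points - 1)
    rows = []
    for i in range(n_grid_points):
        value = low + i * step
        # Fix `name` at the grid value and marginalize over the other parameters.
        predictions = [predict({**sample, name: value}) for sample in samples]
        rows.append({name: value, "objective": sum(predictions) / n_samples})
    return rows


bounds = {"lr": (0.0, 0.2), "momentum": (0.0, 1.0)}
curve = partial_dependency_1d(surrogate, bounds, "lr")
```

Each row of `curve` mirrors the (dim1.name, objective) column layout described above for single parameters.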

orion.analysis.ranking(trials, group_by='order', key='best')

Compute the ranking of some trial attribute.

By default it will compute the ranking with respect to objectives at each time step across multiple experiments.

Parameters

trials: DataFrame: A dataframe of trials containing, at least, the columns ‘best’ and ‘order’.
group_by: str, optional: The attribute used to group trials for the ranking. By default, trials are grouped by order (e.g., all first trials across experiments).
key: str, optional: The attribute to use for the ranking. Defaults to ‘best’ as returned by orion.analysis.regret.

Returns

A copy of the original dataframe with a new column ‘rank’ for the rankings.
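As an illustration of ranking within groups, the sketch below ranks rows that share the same ‘order’ by their ‘best’ value. It is written in plain Python and is not orion's implementation. The helper name, the list-of-dicts input, the convention that lower values rank first, and the handling of ties (tied rows share the lowest rank) are assumptions of this sketch.

```python
from collections import defaultdict


def rank_by_group(trials, group_by="order", key="best"):
    """Return copies of the rows with a 'rank' computed inside each group."""
    groups = defaultdict(list)
    for row in trials:
        groups[row[group_by]].append(row[key])

    ranked = []
    for row in trials:
        # Rank 1 goes to the smallest value of the group (assumed convention).
        smaller = sum(1 for value in groups[row[group_by]] if value < row[key])
        ranked.append({**row, "rank": smaller + 1})
    return ranked


# Two hypothetical experiments compared at each time step.
trials = [
    {"experiment": "a", "order": 0, "best": 4.0},
    {"experiment": "a", "order": 1, "best": 2.0},
    {"experiment": "b", "order": 0, "best": 6.0},
    {"experiment": "b", "order": 1, "best": 1.0},
]
ranked = rank_by_group(trials)
```

Here experiment "a" ranks first at order 0 and experiment "b" ranks first at order 1. The input rows are left unmodified.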

orion.analysis.regret(trials, names=('best', 'best_id'))

Calculates the regret for a collection of orion.core.worker.trial.Trial. The regret is calculated sequentially from the order of the collection.

Parameters

trials: DataFrame or dict: A dataframe of trials containing, at least, the columns ‘objective’ and ‘id’. Or a dict equivalent.
names: tuple of str, optional: A tuple containing the names of the two new columns. Default is (‘best’, ‘best_id’).

Returns

A copy of the original dataframe with two new columns containing respectively the best value so far and its trial id.
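The sequential computation amounts to a running best over the trials in their given order. The following plain-Python sketch illustrates the idea and is not orion's implementation. The helper name `running_best`, the list-of-dicts input, and the assumption that lower objectives are better are all choices made for this example.

```python
def running_best(trials, names=("best", "best_id")):
    """Return copies of the rows annotated with the best objective so far."""
    best_name, best_id_name = names
    best_value, best_id = None, None

    annotated = []
    for trial in trials:
        # Assumes minimization: a strictly lower objective becomes the new best.
        if best_value is None or trial["objective"] < best_value:
            best_value, best_id = trial["objective"], trial["id"]
        annotated.append({**trial, best_name: best_value, best_id_name: best_id})
    return annotated


# Hypothetical trials, listed in the order they were executed.
trials = [
    {"id": "t1", "objective": 5.0},
    {"id": "t2", "objective": 7.0},
    {"id": "t3", "objective": 3.0},
    {"id": "t4", "objective": 4.0},
]
annotated = running_best(trials)
```

The ‘best’ column of `annotated` never increases, and ‘best_id’ changes only when a trial improves on the previous best.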