Analysis
Provides HPO analysis tools
- orion.analysis.average(trials, group_by='order', key='best', return_var=False)
Compute the average of some trial attribute.
By default it will compute the average objective at each time step across multiple experiments.
- Parameters
- trials: DataFrame
A dataframe of trials containing, at least, the columns ‘best’ and ‘order’.
- group_by: str, optional
The attribute used to group trials for the average. By default, trials are grouped by order (e.g., all first trials across experiments).
- key: str, optional
One attribute, or a comma-separated list of attributes, to average. Defaults to ‘best’, as returned by orion.analysis.regret.
- return_var: bool, optional
If True, add a column ‘{key}_var’, where ‘{key}’ is the value of the argument key. Defaults to False.
- Returns
- A dataframe with columns ‘order’, ‘{key}_mean’ and, if return_var is True, ‘{key}_var’.
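To make the grouping concrete, here is a minimal pure-Python sketch of the computation described above: group the trials by ‘order’, then average ‘best’ within each group. It is not Oríon's implementation. The helper name `average_by`, the list-of-dicts input, and the use of the sample variance are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, variance


def average_by(trials, group_by="order", key="best", return_var=False):
    """Average `key` over all trials sharing the same `group_by` value."""
    groups = defaultdict(list)
    for trial in trials:
        groups[trial[group_by]].append(trial[key])

    rows = []
    for group_value in sorted(groups):
        values = groups[group_value]
        row = {group_by: group_value, f"{key}_mean": mean(values)}
        if return_var:
            # Assumption: sample variance, which needs at least two values per group.
            row[f"{key}_var"] = variance(values)
        rows.append(row)
    return rows


# Two hypothetical experiments with two trials each.
trials = [
    {"order": 0, "best": 1.0},  # experiment 1
    {"order": 1, "best": 0.5},
    {"order": 0, "best": 3.0},  # experiment 2
    {"order": 1, "best": 1.5},
]
averaged = average_by(trials, return_var=True)
```

Each row of `averaged` corresponds to one time step, averaged across the two experiments.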
- orion.analysis.lpi(trials, space, mode='best', model='RandomForestRegressor', n_points=20, n_runs=10, **kwargs)
Calculates the Local Parameter Importance (LPI) for a collection of orion.core.worker.trial.Trial.
For more information on the metric, see the original paper at https://ml.informatik.uni-freiburg.de/papers/18-LION12-CAVE.pdf.
Biedenkapp, André, et al. “Cave: Configuration assessment, visualization and evaluation.” International Conference on Learning and Intelligent Optimization. Springer, Cham, 2018.
- Parameters
- trials: DataFrame or dict
A dataframe of trials containing, at least, the columns ‘objective’ and ‘id’. Or a dict equivalent.
- space: Space object
A space object from an experiment.
- mode: str
Mode to compute the LPI. Either ‘best’ (take the best trial found as the anchor for the LPI) or ‘linear’ (recompute the LPI for all values on a grid).
- model: str
Name of the regression model to use. Can be one of AdaBoostRegressor, BaggingRegressor, ExtraTreesRegressor, GradientBoostingRegressor, or RandomForestRegressor (default).
- n_points: int
Number of points to compute the variances. Default is 20.
- n_runs: int
Number of runs to compute the standard error of the LPI. Default is 10.
- **kwargs
Arguments for the regressor model.
- Returns
- DataFrame
LPI value for each parameter. If mode is ‘linear’, a list of parameter values and LPI metrics is returned in DataFrame format.
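The sketch below illustrates the idea behind the metric as defined in the CAVE paper: starting from an anchor configuration, vary one parameter at a time over a grid, measure the variance of the predicted objective, and normalize by the sum of variances over all parameters. It is a simplified illustration, not Oríon's implementation. The `surrogate` function stands in for a fitted regressor, and `local_importance`, the parameter names, and the use of the population variance are assumptions.

```python
from statistics import pvariance


def local_importance(predict, anchor, grids):
    """Fraction of the prediction variance around `anchor` attributed to each parameter."""
    variances = {}
    for name, grid in grids.items():
        # Move only one parameter away from the anchor at a time.
        predictions = [predict({**anchor, name: value}) for value in grid]
        variances[name] = pvariance(predictions)

    total = sum(variances.values())
    return {name: (var / total if total else 0.0) for name, var in variances.items()}


def surrogate(params):
    # Hypothetical stand-in for a fitted regressor:
    # strong dependence on "lr", weak dependence on "momentum".
    return 10.0 * params["lr"] ** 2 + 0.1 * params["momentum"]


anchor = {"lr": 0.0, "momentum": 0.5}  # e.g. the best trial found
grids = {
    "lr": [0.0, 0.25, 0.5, 0.75, 1.0],
    "momentum": [0.0, 0.25, 0.5, 0.75, 1.0],
}
importance = local_importance(surrogate, anchor, grids)
```

With this surrogate, nearly all of the importance is assigned to `lr`, and the values sum to one.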
- orion.analysis.partial_dependency(trials, space, params=None, model='RandomForestRegressor', n_grid_points=10, n_samples=50, **kwargs)
Calculates the partial dependency of parameters in a collection of orion.core.worker.trial.Trial.
- Parameters
- trials: DataFrame or dict
A dataframe of trials containing, at least, the columns ‘objective’ and ‘id’. Or a dict equivalent.
- space: Space object
A space object from an experiment.
- params: list of str, optional
The parameters to include in the computation. All parameters are included by default.
- model: str
Name of the regression model to use. Can be one of AdaBoostRegressor, BaggingRegressor, ExtraTreesRegressor, GradientBoostingRegressor, or RandomForestRegressor (default).
- n_grid_points: int
Number of points in the grid to compute partial dependency. Default is 10.
- n_samples: int
Number of samples to randomly generate the grid used to compute the partial dependency. Default is 50.
- **kwargs
Arguments for the regressor model.
- Returns
- dict
Dictionary of DataFrames. Keys are each pair of parameters, (dim1.name, dim2.name), and each parameter individually, dim1.name. Columns are (dim1.name, dim2.name, objective) for pairs and (dim1.name, objective) for single parameters.
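As a rough illustration of what a single-parameter partial dependency looks like, the sketch below fixes one parameter at each grid value, evaluates a surrogate model on a set of sampled configurations, and averages the predictions. It is a sketch under stated assumptions, not Oríon's implementation. `partial_dependency_1d`, the `surrogate` function, and the parameter names `x` and `y` are all hypothetical.

```python
from statistics import mean


def partial_dependency_1d(predict, samples, name, grid):
    """Average prediction over `samples` with parameter `name` fixed to each grid value."""
    rows = []
    for value in grid:
        predictions = [predict({**sample, name: value}) for sample in samples]
        rows.append({name: value, "objective": mean(predictions)})
    return rows


def surrogate(params):
    # Hypothetical stand-in for a fitted regressor.
    return params["x"] ** 2 + params["y"]


# Sampled configurations (the role played by n_samples) and a grid (n_grid_points).
samples = [{"x": 0.0, "y": 0.0}, {"x": 1.0, "y": 1.0}]
grid = [0.0, 1.0, 2.0]
dependency = partial_dependency_1d(surrogate, samples, "x", grid)
```

The rows mirror the (dim1.name, objective) columns described above. The two-parameter case would fix a pair of values at each point of a two-dimensional grid.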
- orion.analysis.ranking(trials, group_by='order', key='best')
Compute the ranking of some trial attribute.
By default it will compute the ranking with respect to objectives at each time step across multiple experiments.
- Parameters
- trials: DataFrame
A dataframe of trials containing, at least, the columns ‘best’ and ‘order’.
- group_by: str, optional
The attribute used to group trials for the ranking. By default, trials are grouped by order (e.g., all first trials across experiments).
- key: str, optional
The attribute to use for the ranking. Defaults to ‘best’, as returned by orion.analysis.regret.
- Returns
- A copy of the original dataframe with a new column ‘rank’ for the rankings.
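The ranking described above can be sketched in pure Python as follows: within each group of trials sharing the same ‘order’, rank the trials by ‘best’. This is an illustration, not Oríon's implementation. The helper `rank_by`, the ‘experiment’ field, the lower-is-better convention, and the absence of tie handling are assumptions.

```python
from collections import defaultdict


def rank_by(trials, group_by="order", key="best"):
    """Return copies of the trials with a 'rank' field computed within each group."""
    groups = defaultdict(list)
    for index, trial in enumerate(trials):
        groups[trial[group_by]].append(index)

    ranked = [dict(trial) for trial in trials]  # leave the input untouched
    for indices in groups.values():
        # Assumption: rank 1 goes to the lowest value; ties are broken by input position.
        ordered = sorted(indices, key=lambda i: trials[i][key])
        for rank, index in enumerate(ordered, start=1):
            ranked[index]["rank"] = rank
    return ranked


# Two hypothetical experiments, A and B, compared at each time step.
trials = [
    {"experiment": "A", "order": 0, "best": 1.0},
    {"experiment": "B", "order": 0, "best": 3.0},
    {"experiment": "A", "order": 1, "best": 0.5},
    {"experiment": "B", "order": 1, "best": 0.2},
]
ranked = rank_by(trials)
```

Here experiment A ranks first at step 0 and experiment B ranks first at step 1.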
- orion.analysis.regret(trials, names=('best', 'best_id'))
Calculates the regret for a collection of orion.core.worker.trial.Trial. The regret is calculated sequentially from the order of the collection.
- Parameters
- trials: DataFrame or dict
A dataframe of trials containing, at least, the columns ‘objective’ and ‘id’. Or a dict equivalent.
- names: tuple of str, optional
A tuple containing the names of the two new columns. Default is (‘best’, ‘best_id’).
- Returns
- A copy of the original dataframe with two new columns containing, respectively, the best value so far and its trial id.
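The sequential computation can be sketched as a running minimum over the trials in order, keeping track of which trial produced it. This is a minimal pure-Python illustration, not Oríon's implementation. The helper `running_best`, the list-of-dicts input, and the assumption that lower objectives are better are all illustrative.

```python
def running_best(trials, names=("best", "best_id")):
    """Return copies of the trials annotated with the best objective so far and its id."""
    best_name, best_id_name = names
    best_value, best_id = None, None
    result = []
    for trial in trials:  # trials are assumed to be in execution order
        # Assumption: the objective is minimized, so a lower value is an improvement.
        if best_value is None or trial["objective"] < best_value:
            best_value, best_id = trial["objective"], trial["id"]
        result.append({**trial, best_name: best_value, best_id_name: best_id})
    return result


trials = [
    {"id": "a", "objective": 0.9},
    {"id": "b", "objective": 0.4},
    {"id": "c", "objective": 0.7},
]
with_regret = running_best(trials)
```

The third trial does not improve on the second, so it keeps the best value and id of trial "b".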