TPE Algorithm

TPE hyperparameter optimizer

class orion.algo.tpe.CategoricalSampler(tpe: TPE, observations: Sequence | numpy.ndarray, choices: Sequence | numpy.ndarray)[source]

Categorical Sampler for discrete integer and categorical choices

Parameters
tpe: `TPE` algorithm

The tpe algorithm object which this sampler will be part of.

observations: list

Observed values in the dimension

choices: list

Candidate values for the dimension

Methods

get_loglikelis(points)

Return the log likelihood for the points

sample([num])

Sample required number of points

get_loglikelis(points: numpy.ndarray | Sequence[numpy.ndarray]) numpy.ndarray[source]

Return the log likelihood for the points

sample(num: int = 1) ndarray[source]

Sample required number of points
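A minimal standalone sketch of the mechanism, using NumPy instead of the real class (the actual sampler derives its weights from the TPE object; `categorical_probs` is a hypothetical helper):

```python
import numpy as np

def categorical_probs(observations, n_choices, prior_weight=1.0):
    """Blend observed counts with a uniform prior to get choice probabilities.
    (Hypothetical helper; the real class derives its weights from TPE.)"""
    counts = np.bincount(np.asarray(observations, dtype=int), minlength=n_choices)
    weighted = counts + prior_weight  # the prior acts like a pseudo-count per choice
    return weighted / weighted.sum()

rng = np.random.default_rng(0)
probs = categorical_probs([0, 0, 1, 2, 2, 2], n_choices=3)

# sample(num): draw indices according to the blended probabilities
samples = rng.choice(3, size=5, p=probs)

# get_loglikelis(points): log probability of each candidate index
loglikelis = np.log(probs[np.array([0, 1, 2])])
```

The prior term keeps every choice reachable even when it has never been observed, which is why the real sampler can still explore rarely-seen categories.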

class orion.algo.tpe.GMMSampler(tpe: TPE, mus: Sequence[float] | numpy.ndarray, sigmas: Sequence[float] | numpy.ndarray, low: float, high: float, weights: Sequence[float] | numpy.ndarray | None = None, base_attempts: int = 10, attempts_factor: int = 10, max_attempts: int = 10000)[source]

Gaussian Mixture Model Sampler for TPE algorithm

Parameters
tpe: `TPE` algorithm

The tpe algorithm object which this sampler will be part of.

mus: list

Means (mus) of each Gaussian component in the GMM.

sigmas: list

Standard deviations (sigmas) of each Gaussian component in the GMM.

low: real

Lower bound of the sampled points.

high: real

Upper bound of the sampled points.

weights: list

Weights for each Gaussian component in the GMM. Default: None

base_attempts: int, optional

Base number of attempts to sample points within low and high bounds. Defaults to 10.

attempts_factor: int, optional

If all sampled points fall out of bounds, try again with attempts * attempts_factor attempts. Defaults to 10.

max_attempts: int, optional

If all sampled points fall out of bounds, keep multiplying the number of attempts by attempts_factor, up to max_attempts (inclusive). Defaults to 10000.

Methods

get_loglikelis(points)

Return the log likelihood for the points

sample([num, attempts])

Sample required number of points

get_loglikelis(points: numpy.ndarray | list[numpy.ndarray] | Sequence[Sequence[float]]) numpy.ndarray[source]

Return the log likelihood for the points

sample(num: int = 1, attempts: int | None = None) list[numpy.ndarray][source]

Sample required number of points
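The bounded sampling with an escalating retry budget described above can be sketched as follows (a standalone NumPy stand-in, not the actual class; all names are hypothetical):

```python
import numpy as np

def sample_truncated_gmm(mus, sigmas, low, high, weights=None, num=1,
                         base_attempts=10, attempts_factor=10,
                         max_attempts=10000, rng=None):
    """Rejection-sample from a GMM, keeping only points inside [low, high].
    If every draw in the current budget falls out of bounds, multiply the
    budget by attempts_factor, up to max_attempts."""
    rng = rng or np.random.default_rng()
    mus, sigmas = np.asarray(mus, float), np.asarray(sigmas, float)
    if weights is None:
        weights = np.full(len(mus), 1.0 / len(mus))
    out = []
    attempts = base_attempts
    while len(out) < num:
        for _ in range(attempts):
            k = rng.choice(len(mus), p=weights)   # pick a mixture component
            x = rng.normal(mus[k], sigmas[k])     # draw from that Gaussian
            if low <= x <= high:
                out.append(x)
                break
        else:
            attempts *= attempts_factor           # widen the retry budget
            if attempts > max_attempts:
                raise RuntimeError("could not sample within bounds")
    return np.array(out)

def gmm_loglikelis(points, mus, sigmas, weights):
    """Log density of each point under the mixture."""
    points = np.asarray(points, float)[:, None]
    comp = weights * np.exp(-0.5 * ((points - mus) / sigmas) ** 2) \
        / (sigmas * np.sqrt(2 * np.pi))
    return np.log(comp.sum(axis=1))
```

Rejection sampling keeps the truncated distribution proportional to the GMM density inside the bounds, which is what the EI comparison downstream assumes.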

class orion.algo.tpe.TPE(space: Space, seed: int | Sequence[int] | None = None, n_initial_points: int = 20, n_ei_candidates: int = 24, gamma: float = 0.25, equal_weight: bool = False, prior_weight: float = 1.0, full_weight_num: int = 25, max_retry: int = 100, parallel_strategy: dict | None = None)[source]

The Tree-structured Parzen Estimator (TPE) algorithm is a Sequential Model-Based Global Optimization (SMBO) algorithm: it builds models from the historical observed trials and uses them to propose new points.

Instead of modeling p(y|x) as other SMBO algorithms do, TPE models p(x|y) and p(y); p(x|y) is modeled by transforming the generative process, replacing the distributions of the configuration prior with non-parametric densities.

TPE defines p(x|y) using two such densities, l(x) and g(x), where l(x) is the distribution of good points and g(x) is the distribution of bad points. New candidate points are sampled from l(x), and an Expected Improvement (EI) optimization scheme is used to find the most promising point among the candidates.
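The whole loop can be sketched in one dimension, with a crude fixed-bandwidth Parzen window standing in for the adaptive estimators (a sketch under simplifying assumptions, not the actual implementation; all names are hypothetical):

```python
import numpy as np

def tpe_suggest_1d(xs, ys, low, high, gamma=0.25, n_ei_candidates=24, rng=None):
    """One TPE iteration on a 1-D space: split observations into good/bad by
    gamma, model each side with a kernel density, sample candidates from the
    'good' density l(x), and return the candidate maximizing
    log l(x) - log g(x), which is monotone in the Expected Improvement."""
    rng = rng or np.random.default_rng()
    order = np.argsort(ys)                      # lower objective = better
    n_good = max(1, int(np.ceil(gamma * len(xs))))
    good = np.asarray(xs)[order[:n_good]]
    bad = np.asarray(xs)[order[n_good:]]

    def log_density(points, centers):
        # fixed-bandwidth Parzen window (the real estimator adapts sigmas)
        sigma = 0.1 * (high - low)
        d = (points[:, None] - centers[None, :]) / sigma
        pdf = np.exp(-0.5 * d ** 2) / (sigma * np.sqrt(2 * np.pi))
        return np.log(pdf.mean(axis=1) + 1e-300)

    # sample candidates from l(x): pick a good point, jitter, clip to bounds
    centers = rng.choice(good, size=n_ei_candidates)
    jitter = rng.normal(0, 0.1 * (high - low), n_ei_candidates)
    candidates = np.clip(centers + jitter, low, high)
    scores = log_density(candidates, good) - log_density(candidates, bad)
    return candidates[np.argmax(scores)]
```

Candidates dense under l(x) but sparse under g(x) score highest, so the suggestion is pulled toward regions where the good trials cluster.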

For more information, see the original papers: Bergstra et al., Algorithms for Hyper-Parameter Optimization (NIPS 2011) and Bergstra et al., Making a Science of Model Search (ICML 2013).

Parameters
space: `orion.algo.space.Space`

Optimisation space with priors for each dimension.

seed: None, int or sequence of int, optional

Seed to sample initial points and candidates points. Default: None

n_initial_points: int, optional

Number of initial points sampled at random. If new points are requested and fewer than n_initial_points have been observed, the next points will also be sampled randomly instead of being drawn from the Parzen estimators. Default: 20

n_ei_candidates: int, optional

Number of candidate points sampled for the EI computation. Larger numbers lead to more exploitation and lower numbers to more exploration. Be careful with categorical dimensions, as TPE tends to severely exploit these if n_ei_candidates is larger than 1. Default: 24

gamma: real, optional

Ratio used to split the observed trials into good and bad distributions. Lower numbers lead to more exploitation and larger numbers to more exploration. Default: 0.25

equal_weight: bool, optional

True to set equal weights for observed points. Default: False

prior_weight: real, optional

The weight given to the prior point of the input space. Default: 1.0

full_weight_num: int, optional

The number of the most recent trials which get the full weight; the others get a linear ramp from 0 to 1.0. Only takes effect if equal_weight is False. Default: 25

max_retry: int, optional

Number of attempts to sample new points if the sampled points were already suggested. Default: 100

parallel_strategy: dict or None, optional

The configuration of a parallel strategy to use for pending trials or broken trials. Default is a MaxParallelStrategy for broken trials and NoParallelStrategy for pending trials.

Attributes
requires_type
state_dict

Return a state dict that can be used to reset the state of the algorithm.

Methods

seed_rng(seed)

Seed the state of the random number generator.

set_state(state_dict)

Reset the state of the algorithm based on the given state_dict

split_trials()

Split the observed trials into good and bad ones based on the ratio gamma.

suggest(num)

Suggest a num of new sets of parameters.

seed_rng(seed: int | Sequence[int] | None) None[source]

Seed the state of the random number generator.

Parameters

seed – Integer seed for the random number generator.

set_state(state_dict: dict) None[source]

Reset the state of the algorithm based on the given state_dict

Parameters

state_dict – Dictionary representing state of an algorithm

split_trials() tuple[list[Trial], list[Trial]][source]

Split the observed trials into good and bad ones based on the ratio gamma.
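As a sketch of the split on plain (params, objective) pairs rather than Trial objects (hypothetical helper, assuming lower objective is better):

```python
import numpy as np

def split_trials(trials, gamma=0.25):
    """Split (params, objective) pairs into good/bad by the gamma ratio.
    At least one trial always lands in the good set."""
    ordered = sorted(trials, key=lambda t: t[1])   # best objective first
    n_good = max(1, int(np.ceil(gamma * len(ordered))))
    return ordered[:n_good], ordered[n_good:]
```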

property state_dict: dict

Return a state dict that can be used to reset the state of the algorithm.

suggest(num: int) list[Trial] | None[source]

Suggest num new sets of parameters. Draw samples from the input space and return them.

Parameters
num: int

Number of trials to sample.

.. note:: New parameters must be compliant with the problem’s domain orion.algo.space.Space.

orion.algo.tpe.adaptive_parzen_estimator(mus: numpy.ndarray | Sequence, low: float, high: float, prior_weight: float = 1.0, equal_weight: bool = False, flat_num: int = 25) tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Return the sorted mus with the corresponding sigmas and weights computed by the adaptive kernel estimator.

This adaptive Parzen window estimator is based on the original papers; this implementation also makes use of the prior mean.

Parameters
  • mus – list of real values for observed mus.

  • low – real value for lower bound of points.

  • high – real value for upper bound of points.

  • prior_weight – real value for the weight of the prior mean.

  • equal_weight – bool indicating whether all points get equal weights.

  • flat_num – the number of the most recent trials which get the full weight; the others get a linear ramp from 0 to 1.0. Only takes effect if equal_weight is False.
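A simplified stand-in illustrating the same ingredients (sorted mus, a prior component at the middle of the domain, neighbour-distance sigmas with clipping); the numerical details here are assumptions and differ from the real estimator:

```python
import numpy as np

def adaptive_parzen(mus, low, high, prior_weight=1.0):
    """Sketch of an adaptive Parzen estimator: sort observed mus, add a
    prior component at the middle of the domain, set each sigma from the
    distance to its neighbours, and clip sigmas to a sensible range."""
    prior_mu, prior_sigma = (low + high) / 2.0, high - low
    mus = np.sort(np.append(np.asarray(mus, float), prior_mu))
    if len(mus) > 1:
        left = np.diff(mus, prepend=mus[0])    # gap to left neighbour
        right = np.diff(mus, append=mus[-1])   # gap to right neighbour
        sigmas = np.maximum(left, right)
        sigmas[0], sigmas[-1] = right[0], left[-1]  # edges use their only gap
    else:
        sigmas = np.array([prior_sigma])
    # clip so kernels neither collapse to spikes nor flood the whole domain
    sigmas = np.clip(sigmas, (high - low) / min(100.0, len(mus)), high - low)
    weights = np.full(len(mus), 1.0)
    weights[np.searchsorted(mus, prior_mu)] = prior_weight
    return mus, sigmas, weights / weights.sum()
```

Tying each sigma to the neighbour gaps makes the density narrow where observations cluster and wide where they are sparse, which is the "adaptive" part of the estimator.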

orion.algo.tpe.compute_max_ei_point(points: numpy.ndarray | Sequence[numpy.ndarray], below_likelis: Sequence[float] | numpy.ndarray, above_likelis: Sequence[float] | numpy.ndarray) numpy.ndarray[source]

Compute ei among points based on their log likelihood and return the point with max ei.

Parameters
  • points – list of point with real values.

  • below_likelis – list of log likelihood for each point in the good GMM.

  • above_likelis – list of log likelihood for each point in the bad GMM.
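Since EI under the TPE model is monotone in the ratio l(x)/g(x), the best point can be found directly from the two log-likelihood arrays; a minimal sketch (standalone stand-in for the function):

```python
import numpy as np

def compute_max_ei_point(points, below_likelis, above_likelis):
    """Pick the candidate maximizing log l(x) - log g(x); EI under the
    TPE model is monotone in l(x)/g(x), so the argmax is the same."""
    scores = np.asarray(below_likelis) - np.asarray(above_likelis)
    return np.asarray(points)[np.argmax(scores)]

best = compute_max_ei_point(
    points=[0.1, 0.5, 0.9],
    below_likelis=[-1.0, -0.2, -2.0],  # log-likelihood under the "good" GMM
    above_likelis=[-0.5, -1.5, -0.1],  # log-likelihood under the "bad" GMM
)
# best == 0.5: highest under l(x) and low under g(x)
```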

orion.algo.tpe.ramp_up_weights(total_num: int, flat_num: int, equal_weight: bool) ndarray[source]

Adjust weights of observed trials.

Parameters
  • total_num – total number of observed trials.

  • flat_num – the number of the most recent trials which get the full weight; the others get a linear ramp from 0 to 1.0. Only takes effect if equal_weight is False.

  • equal_weight – whether all the observed trials share the same weight.
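A sketch of the ramp under the stated semantics (the exact endpoint handling here is an assumption):

```python
import numpy as np

def ramp_up_weights(total_num, flat_num, equal_weight):
    """The most recent flat_num trials get full weight (1.0); older trials
    ramp up linearly toward 1.0. With equal_weight, everyone gets 1.0."""
    if equal_weight or total_num <= flat_num:
        return np.ones(total_num)
    n_ramp = total_num - flat_num
    # oldest trial gets ~1/total_num; the ramp stops just below full weight
    ramp = np.linspace(1.0 / total_num, 1.0, num=n_ramp, endpoint=False)
    return np.concatenate([ramp, np.ones(flat_num)])
```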