Utilities for training the Profet meta-model¶
Options and utilities for training the Profet meta-model from Emukit.
- class orion.benchmark.task.profet.model_utils.MetaModelConfig(benchmark: str, task_id: int = 0, seed: int = 123, num_burnin_steps: int = 50000, num_steps: int = 13000, mcmc_thining: int = 100, lr: float = 0.01, batch_size: int = 5, max_samples: Optional[int] = None, n_inducing_lvm: int = 50, max_iters: int = 10000, n_samples_task: int = 500)[source]¶
Configuration options for the training of the Profet meta-model.
- Attributes
- max_samples
Methods
- get_task_network(input_path): Create, train and return a surrogate model for the given benchmark, seed and task_id.
- load_data(input_path): Load the profet data for the given benchmark from the input directory.
- load_task_network(checkpoint_file): Load the result of the get_task_network function stored in the pickle file.
- normalize_Y(Y, indexD): Normalize the Y array and return its mean and standard deviations.
- save_task_network(checkpoint_file, network, h): Save the meta-model for the task at the given path.
- get_architecture: Callable that takes the input dimensionality and returns the network to be trained.
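A minimal usage sketch, assuming the "fcnet" benchmark and direct instantiation with the documented arguments (a benchmark-specific subclass would be constructed the same way):

>>> from orion.benchmark.task.profet.model_utils import MetaModelConfig
>>> config = MetaModelConfig(
...     benchmark="fcnet",  # which Profet benchmark to build a meta-model for (assumed name)
...     task_id=0,          # index of the task within that benchmark
...     seed=123,           # seed controlling data sampling and meta-model training
...     max_samples=10000,  # optional cap on the number of training samples
... )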
- batch_size: int = 5¶
The batch_size argument of Bohamiann.train.
- get_architecture(classification: bool = False, n_hidden: int = 500) → Any¶
Callable that takes the input dimensionality and returns the network to be trained.
- get_task_network(input_path: Union[Path, str]) → Tuple[Any, ndarray] [source]¶
Create, train and return a surrogate model for the given benchmark, seed and task_id.
- Parameters
- input_path : Union[Path, str]
Data directory containing the json files.
- Returns
- Tuple[Any, np.ndarray]
The surrogate model for the objective, as well as an array of sampled task features.
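A sketch of calling get_task_network, assuming the Profet json files live in a local ./profet_data directory (placeholder path) and config is a MetaModelConfig as constructed above:

>>> from pathlib import Path
>>> network, h = config.get_task_network(Path("./profet_data"))
>>> # network: trained surrogate for the objective; h: array of sampled task features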
- hidden_space¶
Size of the hidden space for this benchmark.
- load_data(input_path: Union[str, Path]) → Tuple[ndarray, ndarray, ndarray] [source]¶
Load the profet data for the given benchmark from the input directory.
When the input directory doesn’t exist, attempts to download the data to create the input directory.
- Parameters
- input_path : Union[str, Path]
Input directory. Expects to find a json file for the given benchmark inside that directory.
- Returns
- Tuple[np.ndarray, np.ndarray, np.ndarray]
X, Y, and C arrays.
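A sketch of loading the raw benchmark data, assuming the same placeholder directory; if the directory does not exist, the data is downloaded first:

>>> X, Y, C = config.load_data("./profet_data")
>>> # X, Y and C are NumPy arrays for the configured benchmark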
- load_task_network(checkpoint_file: Union[str, Path]) → Tuple[Any, ndarray] [source]¶
Load the result of the get_task_network function stored in the pickle file.
- Parameters
- checkpoint_file : Union[str, Path]
Path to a pickle file. The file is expected to contain a serialized dictionary, with keys “benchmark”, “size”, “network”, and “h”.
- Returns
- Tuple[Any, np.ndarray]
The surrogate model for the objective, as well as an array of sampled task features.
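A sketch of restoring a meta-model previously written with save_task_network, assuming a placeholder checkpoint path:

>>> network, h = config.load_task_network("./profet_data/checkpoint.pkl")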
- max_iters: int = 10000¶
Argument passed to the optimize method of the BayesianGPLVM instance that is used in the call to get_features. Appears to be the number of training iterations to perform.
- max_samples: Optional[int] = None¶
Maximum number of data samples to use when training the meta-model. This can be useful if the dataset is large (e.g. the FCNet task) and memory is limited.
- mcmc_thining: int = 100¶
keep_every argument of Bohamiann.train.
(copied from Bohamiann.train): Number of sampling steps (after burn-in) to perform before keeping a sample. In total, num_steps // keep_every network weights will be sampled.
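With the defaults used here (num_steps=13000, mcmc_thining=100), 13000 // 100 = 130 sets of network weights are kept after burn-in.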
- n_inducing_lvm: int = 50¶
Passed as the value for the "num_inducing" argument of the BayesianGPLVM constructor.
(copied from GPy.core.sparse_gp_mpi.SparseGP_MPI): Number of inducing points (optional, default 10. Ignored if Z is not None).
- normalize_Y(Y: ndarray, indexD: ndarray) → Tuple[ndarray, ndarray, ndarray] [source]¶
Normalize the Y array and return its mean and standard deviations.
- Parameters
- Y : np.ndarray
Labels from the datasets.
- indexD : np.ndarray
Task indices of corresponding labels Y.
- Returns
- Tuple[np.ndarray, np.ndarray, np.ndarray]
Tuple containing the Y array, the mean array, and the std array.
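A sketch of normalize_Y with illustrative array shapes (in practice Y and indexD would come from the loaded benchmark data):

>>> import numpy as np
>>> Y = np.random.rand(100, 1)             # labels from the datasets
>>> indexD = np.random.randint(0, 5, 100)  # task index of each label
>>> Y_norm, means, stds = config.normalize_Y(Y, indexD)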
- num_burnin_steps: int = 50000¶
(copied from Bohamiann.train): Number of burn-in steps to perform. This value is passed to the given optimizer if it supports special burn-in specific behavior. Networks sampled during burn-in are discarded.
- num_steps: int = 13000¶
Value passed to the argument of the same name in Bohamiann.train.
(copied from Bohamiann.train): Number of sampling steps to perform after burn-in is finished. In total, num_steps // keep_every network weights will be sampled.
- save_task_network(checkpoint_file: Union[str, Path], network: Any, h: ndarray) → None [source]¶
Save the meta-model for the task at the given path.
- Parameters
- checkpoint_file : Union[str, Path]
Path where the model should be saved.
- network : Any
The network to save.
- h : np.ndarray
The embedding vector.
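A sketch of saving a trained meta-model, assuming network and h were returned by get_task_network and using a placeholder checkpoint path:

>>> config.save_task_network("./profet_data/checkpoint.pkl", network=network, h=h)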