Storage¶

Commands are available to help configure, test and upgrade the storage of Oríon. There are also commands to delete experiments and trials or to update values in the storage.

For more flexibility, there is the dump command (see Export database content).

Commands¶

`setup` Storage configuration¶

The setup command helps create a global configuration file for Oríon’s storage. For more details on its usage, see Configuring the database in the database installation and configuration section.

`test` Test storage configuration¶

The test command provides a simple and efficient way of testing the storage configuration. For more details on its usage see Testing the configuration in the database installation and configuration section.

`rm` Delete data from storage¶

Command to delete experiments and trials.

To delete an experiment and its trials, simply give the experiment’s name.

orion db rm my-exp-name

To delete only trials that are broken, simply add --status broken. Note that the experiment will not be deleted, only the trials.

orion db rm my-exp-name --status broken

Or --status '*' to delete all trials of the experiment. The quotes keep the shell from expanding the asterisk.

orion db rm my-exp-name --status '*'
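When these commands are assembled from a script, the asterisk has to reach Oríon literally rather than being expanded by the shell. The following is a minimal sketch, assuming the command line is built in Python; the helper name `rm_command` is hypothetical, and only the `orion db rm` arguments shown above are used.

```python
import shlex


def rm_command(name, status=None, version=None):
    """Build the argument list for `orion db rm` (hypothetical helper)."""
    argv = ["orion", "db", "rm", name]
    if status is not None:
        argv += ["--status", status]
    if version is not None:
        argv += ["--version", str(version)]
    return argv


argv = rm_command("my-exp-name", status="*")

# shlex.join quotes the wildcard so that a shell would not expand it.
print(shlex.join(argv))
```

Passing the list directly to `subprocess.run(argv)` without a shell avoids the expansion problem altogether, since no globbing takes place.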

By default, the latest version of the experiment is deleted. Add --version to select a prior version. Note that all children of the selected version will be deleted as well. You cannot delete a parent experiment without deleting the child experiments.

orion db rm my-exp-name --version 1

`set` Change value of data in storage¶

Command to update trial attributes.

To change a trial status, simply give the experiment name, trial id and status (use orion status --all to get trial ids).

orion db set my-exp-name id=3cc91e851e13281ca2152c19d888e937 status=interrupted

To change all trials from a given status to another, simply give the two statuses.

orion db set my-exp-name status=broken status=interrupted

Or use '*' to apply the change to all trials.

orion db set my-exp-name '*' status=interrupted

By default, trials of the latest version of the experiment are selected. Add --version to select a prior version. Note that the modification is applied recursively to all child experiments, but not to the parents.

orion db set my-exp-name --version 1 status=broken status=interrupted

`release` algorithm lock¶

The algorithm state is saved in the storage so that it can be shared across main processes ($ orion hunt or experiment_client.workon()). The algorithm state is locked while the algorithm is updated by observing completed trials and while it suggests new trials. Sometimes the process may be killed while the algorithm is locked, leading to a deadlock. The lock can be manually released using the orion db release command.

orion db release my-exp-name --version 1

Make sure no Oríon process is running with this experiment while executing this command, or you risk having an algorithm state saved in the storage that is inconsistent with the trials saved in the storage.

`upgrade` Upgrade database schema¶

The database schema may change from one version of Oríon to another. If such a change happens, you will get the following error after upgrading Oríon.

The database is outdated. You can upgrade it with the command `orion db upgrade`.

Make sure to create a backup of your database before upgrading it. You should also make sure that no process writes to the database during the upgrade; otherwise the upgrade could fail. When ready, simply run the upgrade command.

orion db upgrade
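For a file-based database such as PickledDB, the backup recommended above can be as simple as copying the file before running the upgrade. The sketch below assumes that case only; the helper name `backup_db_file` and the `.bak` naming scheme are illustrative choices, and a server-based database would need its own backup tooling instead.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_db_file(db_path):
    """Copy a file-based database next to itself with a timestamp suffix."""
    db_path = Path(db_path)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup_path = db_path.with_name(f"{db_path.name}.{stamp}.bak")
    shutil.copy2(db_path, backup_path)  # copy2 also preserves file metadata
    return backup_path


# Demonstration on a throwaway file standing in for a real database file.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "db.pkl"
    db.write_bytes(b"placeholder content")
    backup = backup_db_file(db)
    print(backup.name, backup.read_bytes() == db.read_bytes())
```

The copy should be taken while no process is writing to the file, for the same reason the upgrade itself requires it.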

`dump` Export database content¶

The dump command exports database content to a PickledDB PKL file.

orion db dump -o backup.pkl

You can also dump a specific experiment.

orion db dump -n exp-name -v exp-version -o backup-exp.pkl

`load` Import database content¶

The load command imports database content from any PickledDB PKL file (including files generated by the dump command).

You must specify a conflict resolution policy with the -r/--resolve argument. It is applied when conflicts are detected during import. Available policies are:

ignore, to ignore imported data
overwrite, to replace old data with imported data
bump, to bump the version of the imported data and then import it

By default, the whole PKL file is imported.

orion db load backup.pkl -r ignore

You can also import a specific experiment.

orion db load backup.pkl -r overwrite -n exp-name -v exp-version
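The three policies can be pictured with a small in-memory model. This is only a sketch of the behaviour described above, not Oríon's implementation: the storage is a plain dict keyed by (name, version), a conflict is assumed to mean that the key already exists, and `bump` is assumed to pick the next version after the highest existing one. The function name `import_experiment` is hypothetical.

```python
def import_experiment(storage, name, version, data, resolve):
    """Import one experiment into `storage`, a dict keyed by (name, version)."""
    key = (name, version)
    if key not in storage:
        storage[key] = data  # no conflict: plain import
        return key
    if resolve == "ignore":
        return None  # keep the existing data, drop the imported data
    if resolve == "overwrite":
        storage[key] = data  # replace the existing data
        return key
    if resolve == "bump":
        # Assumption: import under the next version after the highest one.
        new_version = max(v for (n, v) in storage if n == name) + 1
        storage[(name, new_version)] = data
        return (name, new_version)
    raise ValueError(f"unknown policy: {resolve}")


storage = {("exp-name", 1): "existing"}
print(import_experiment(storage, "exp-name", 1, "imported", "ignore"))
print(import_experiment(storage, "exp-name", 1, "imported", "bump"))
print(sorted(storage.items()))
```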

Python APIs¶

In short, users are expected to only use the ExperimentClient to interact with the storage client, to fetch and register trials. Creation of experiments should always be done through create_experiment().

If you need to access the storage with more flexibility, you can do so using the methods of the storage client directly. See Storage section for more details.

Finally, legacy databases supported by Oríon can also be accessed directly as a last resort if the storage backend is not flexible enough. See Database section for more details.

ExperimentClient¶

The experiment client must be created with the helper function get_experiment(), which takes care of initializing the storage backend and loading the corresponding experiment from the storage. To create a new experiment, use create_experiment().

There is a small subset of methods to fetch trials or register new ones. We focus here on the methods for loading or creating trials; see ExperimentClient for documentation of all methods.

The experiment client can be loaded in read-only or read/write mode. Make sure to load the experiment with the proper mode if you want to edit the database. For full read/write/execution rights, use create_experiment().

Here is a short example to fetch trials or insert a new one.

from orion.client import create_experiment

# Create the ExperimentClient
experiment = create_experiment('exp-name', space=dict(x='uniform(0, 1)'))

# To fetch all trials from an experiment
trials = experiment.fetch_trials()

# To fetch trials as a pandas DataFrame
df = experiment.to_pandas()

# Insert a new trial in storage
experiment.insert(dict(x=0.5))

# Insert a new trial and reserve to execute
trial = experiment.insert(dict(x=0.6), reserve=True)
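Since fetch_trials() returns a plain list of trials, small summaries are easy to build on top of it. The sketch below counts trials per status; the helper `count_by_status` is hypothetical, and a stand-in object replaces the real ExperimentClient so that the example runs without a database. It relies only on fetch_trials() and on the trial's status attribute.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_status(experiment):
    """Count trials per status using only fetch_trials()."""
    return Counter(trial.status for trial in experiment.fetch_trials())


# Stand-in for an ExperimentClient so the sketch runs without a database.
fake_trials = [SimpleNamespace(status=s) for s in ("completed", "completed", "broken")]
fake_experiment = SimpleNamespace(fetch_trials=lambda: fake_trials)

print(count_by_status(fake_experiment))
```

With a real client, the object returned by create_experiment() or get_experiment() would be passed in place of `fake_experiment`.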

to_pandas¶

ExperimentClient.to_pandas(with_evc_tree=False)[source]

Builds a dataframe with the trials of the experiment

Parameters

with_evc_tree: bool, optional: Fetch all trials from the EVC tree. Default: False

fetch_trials¶

ExperimentClient.fetch_trials(with_evc_tree=False) → list[Trial][source]

Fetch all trials of the experiment

Parameters

with_evc_tree: bool, optional: Fetch all trials from the EVC tree. Default: False

fetch_trials_by_status¶

ExperimentClient.fetch_trials_by_status(status, with_evc_tree=False)[source]

Fetch all trials with the given status

Trials are sorted based on Trial.submit_time

Returns: list of orion.core.worker.trial.Trial objects

fetch_noncompleted_trials¶

ExperimentClient.fetch_noncompleted_trials(with_evc_tree=False)[source]

Fetch non-completed trials of this Experiment instance.

Trials are sorted based on Trial.submit_time

Note

It will return all non-completed trials, including new, reserved, suspended, interrupted and broken ones.

Returns: list of non-completed orion.core.worker.trial.Trial objects

get_trial¶

ExperimentClient.get_trial(trial=None, uid=None)[source]

Fetch a single trial

Parameters

trial: Trial, optional: trial object to retrieve from the database
uid: str, optional: trial id used to retrieve the trial object

Returns

None if the trial is not found.

Raises

UndefinedCall: if both trial and uid are not set
AssertionError: if both trial and uid are provided and they do not match

insert¶

ExperimentClient.insert(params, results=None, reserve=False)[source]

Insert a new trial.

Experiment must be in writable (‘w’) or executable (‘x’) mode.

Parameters

params: dict: Parameters of the new trial to add to the database. These parameters must comply with the space definition; otherwise a ValueError will be raised.
results: list, optional: Results to be set for the new trial. Each result must have the format {name: <str>, type: <‘objective’, ‘constraint’ or ‘gradient’>, value: <float>}; otherwise a ValueError will be raised. Note that passing results will mark the trial as completed, so it cannot be reserved. The returned trial will have status ‘completed’. If the results are invalid, the trial will still be inserted but reservation will be released.
reserve: bool, optional: If reserve=True, the inserted trial will be reserved. reserve cannot be True if results are given. Defaults to False.

Returns

orion.core.worker.trial.Trial: The trial inserted in storage. If reserve=True and no results are given, the returned trial will be in a reserved status.

Raises

ValueError

If results are given and reserve=True
If params have invalid format
If results have invalid format

orion.core.io.database.DuplicateKeyError

If a trial with identical params already exists for the current experiment.

orion.core.utils.exceptions.UnsupportedOperation

If the experiment was not loaded in writable mode.
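As an illustration of the results format, the sketch below wraps insert() to register a trial that already has an objective value. The helper `insert_completed` and the stand-in experiment are hypothetical; only the insert(params, results=...) call and the name/type/value keys come from the description above.

```python
from types import SimpleNamespace


def insert_completed(experiment, params, objective):
    """Insert a trial together with its objective value (hypothetical helper)."""
    results = [dict(name="objective", type="objective", value=float(objective))]
    # reserve is left at its default (False): results and reserve=True conflict.
    return experiment.insert(params, results=results)


# Stand-in for an ExperimentClient that records what it receives.
calls = []


def fake_insert(params, results=None, reserve=False):
    calls.append((params, results, reserve))
    return SimpleNamespace(params=params, status="completed" if results else "new")


fake_experiment = SimpleNamespace(insert=fake_insert)

trial = insert_completed(fake_experiment, dict(x=0.5), 0.25)
print(trial.status, calls[0][1])
```

With a real client, the call can raise the exceptions listed above, so callers that insert possibly duplicated points may want to catch DuplicateKeyError around it.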

Storage¶

Warning

The storage backends are not meant to be used directly by users. Be careful if you use any method which modifies the data in storage or you may break your experiment or trials.

The storage backend is used by the ExperimentClient to read and write persistent records of the experiment and trials. Although we recommend using the experiment client, we document the storage backend here for users who may need more flexibility.

You should try to use a single storage instance for each physical storage to minimize the amount of locking and/or connections. If the storage is otherwise unreachable you can create a new storage client with orion.storage.base.setup_storage().

To recap, you can create it indirectly with create_experiment() or directly with setup_storage().

from orion.client import create_experiment
from orion.storage.base import setup_storage

# Create the ExperimentClient and storage implicitly
experiment = create_experiment('exp-name', space=dict(x='uniform(0, 1)'))

# Or create storage explicitly using setup_storage
storage = setup_storage(dict(
    type='legacy',
    database=dict(
        type='pickleddb',
        host='db.pkl'
    )
))

# fetch trials
trials = storage.fetch_trials(uid=experiment.id)

# Update trial status
storage.set_trial_status(trials[0], 'interrupted')

Note

The function setup_storage() reads the global configuration like create_experiment() does if there is missing information. Therefore, it is possible to call it without any argument the same way it is possible to call create_experiment() without specifying storage configuration.

update_experiment¶

BaseStorageProtocol.update_experiment(experiment: Experiment | None = None, uid: str | int | None = None, where: dict | None = None, **kwargs: Unpack[PartialExperimentConfig]) → bool[source]

Update the fields of a given experiment

Parameters

experiment: Experiment, optional: experiment object to retrieve from the database
uid: str or int, optional: experiment id used to retrieve the experiment object
where: Optional[dict]: constraint experiment must respect
**kwargs: dict: a dictionary of fields to update

Returns

True if the underlying storage was updated.

Raises

UndefinedCall: if both experiment and uid are not set
AssertionError: if both experiment and uid are provided and they do not match

fetch_experiments¶

BaseStorageProtocol.fetch_experiments(query: dict, selection: dict | None = None) → list[ExperimentConfig][source]: Fetch all experiments that match the query

delete_experiment¶

BaseStorageProtocol.delete_experiment(experiment: Experiment | None = None, uid: str | int | None = None)[source]

Delete matching experiments from the database

Parameters

experiment: Experiment, optional: experiment object to retrieve from the database
uid: str or int, optional: experiment id used to retrieve the experiment object

Returns

Number of experiments deleted.

Raises

UndefinedCall: if both experiment and uid are not set
AssertionError: if both experiment and uid are provided and they do not match

register_trial¶

BaseStorageProtocol.register_trial(trial: Trial)[source]: Create a new trial to be executed

reserve_trial¶

BaseStorageProtocol.reserve_trial(experiment: Experiment) → Trial | None[source]

Select a pending trial and reserve it for the worker

Returns

Returns the reserved trial or None if no trials were found

fetch_trials¶

Fetch all the trials of an experiment in the database

Parameters

experiment: Experiment, optional: experiment object to retrieve from the database
uid: str or int, optional: experiment id used to retrieve the trial object
where: Optional[dict]: constraint trials must respect

Returns

None if the experiment is not found.

Raises

UndefinedCall: if both experiment and uid are not set
AssertionError: if both experiment and uid are provided and they do not match

delete_trials¶

BaseStorageProtocol.delete_trials(experiment: Experiment | None = None, uid: str | int | None = None, where: dict | None = None) → int[source]

Delete matching trials from the database

Parameters

experiment: Experiment, optional: experiment object to retrieve from the database
uid: str or int, optional: experiment id used to retrieve the trial object
where: Optional[dict]: constraint trials must respect

Returns

Number of trials deleted.

Raises

UndefinedCall: if both experiment and uid are not set
AssertionError: if both experiment and uid are provided and they do not match

get_trial¶

Fetch a single trial

Parameters

trial: Trial, optional: trial object to retrieve from the database
uid: str, optional: trial id used to retrieve the trial object
experiment_uid: str or int, optional: experiment id used to retrieve the trial object

Returns

None if the trial is not found.

Raises

UndefinedCall: if both trial and uid are not set
AssertionError: if both trial and uid are provided and they do not match

update_trials¶

BaseStorageProtocol.update_trials(experiment: Experiment | None = None, uid: str | int | None = None, where: dict | None = None, **kwargs)[source]

Update trials of a given experiment matching a query

Parameters

experiment: Experiment, optional: experiment object to retrieve from the database
uid: str or int, optional: experiment id used to retrieve the trial object
where: Optional[dict]: constraint trials must respect
**kwargs: dict: a dictionary of fields to update

Raises

UndefinedCall: if both experiment and uid are not set
AssertionError: if both experiment and uid are provided and they do not match

update_trial¶

Update fields of a given trial

Parameters

trial: Trial, optional: trial object to update in the database
uid: str, optional: id of the trial to update in the database
experiment_uid: str or int, optional: experiment id of the trial to update in the database
where: Optional[dict]: constraint trials must respect. Note: useful to handle race conditions.
**kwargs: dict: a dictionary of fields to update

Raises

UndefinedCall: if both trial and uid are not set
AssertionError: if both trial and uid are provided and they do not match

fetch_lost_trials¶

BaseStorageProtocol.fetch_lost_trials(experiment: Experiment) → list[Trial][source]: Fetch all trials that have a heartbeat older than some given time delta (2 minutes by default)
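The heartbeat criterion can be sketched with plain datetime arithmetic. This only models the comparison described above with the 2-minute default; the function name `is_lost` is hypothetical, and the real method may apply additional filters (for example on the trial status) that are not shown here.

```python
from datetime import datetime, timedelta


def is_lost(heartbeat, now, max_idle=timedelta(minutes=2)):
    """Return True when the last heartbeat is older than `max_idle`."""
    return now - heartbeat > max_idle


now = datetime(2024, 1, 1, 12, 0, 0)
heartbeats = {
    "trial-a": now - timedelta(seconds=30),  # heartbeat received recently
    "trial-b": now - timedelta(minutes=5),   # no heartbeat for 5 minutes
}

lost = [trial_id for trial_id, hb in heartbeats.items() if is_lost(hb, now)]
print(lost)
```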

fetch_pending_trials¶

BaseStorageProtocol.fetch_pending_trials(experiment: Experiment) → list[Trial][source]: Fetch all trials that are available to be executed by a worker, this includes new, suspended and interrupted trials

fetch_noncompleted_trials¶

BaseStorageProtocol.fetch_noncompleted_trials(experiment: Experiment) → list[Trial][source]: Fetch all non completed trials
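The status groups used by these two methods can be written down as sets. The pending group is taken from fetch_pending_trials above; the non-completed group follows the note in the ExperimentClient section, and it is an assumption that the storage method uses the same group. The names `PENDING`, `NONCOMPLETED` and `can_be_picked_up` are illustrative.

```python
# Statuses a worker may pick up, as listed for fetch_pending_trials.
PENDING = frozenset({"new", "suspended", "interrupted"})

# Non-completed statuses: the pending ones plus reserved and broken.
NONCOMPLETED = PENDING | {"reserved", "broken"}


def can_be_picked_up(status):
    """Return True if a trial with this status is available to a worker."""
    return status in PENDING


statuses = ["new", "reserved", "broken", "completed", "interrupted"]
print([s for s in statuses if can_be_picked_up(s)])
print([s for s in statuses if s in NONCOMPLETED])
```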

fetch_trials_by_status¶

BaseStorageProtocol.fetch_trials_by_status(experiment: Experiment, status: str) → list[Trial][source]: Fetch all trials with the given status

count_completed_trials¶

BaseStorageProtocol.count_completed_trials(experiment: Experiment) → int[source]: Count the number of completed trials

count_broken_trials¶

BaseStorageProtocol.count_broken_trials(experiment: Experiment) → int[source]: Count the number of broken trials

set_trial_status¶

BaseStorageProtocol.set_trial_status(trial: Trial, status: str, heartbeat: datetime | None = None, was: str | None = None)[source]

Update the trial status and the heartbeat

Parameters

trial: `Trial` object: Trial object to update in the database.
status: str: Status to be set to the trial
heartbeat: datetime, optional: New heartbeat to update simultaneously with status
was: str, optional: The status the trial is expected to have in the database before the update. If None, the current trial.status will be used. This is used to ensure coherence in the database, protecting against race conditions for instance.

Raises

FailedUpdate: The exception is raised if the status of the trial object does not match the status in the database
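The role of `was` is easiest to see with two workers competing for the same trial. The toy class below is not Oríon's storage: it keeps statuses in a dict, takes a trial id instead of a Trial object, and defines its own stand-in `FailedUpdate` exception. It only models the rule that the update goes through when the stored status still matches the expected one.

```python
class FailedUpdate(Exception):
    """Local stand-in for the FailedUpdate exception named above."""


class ToyStorage:
    """In-memory toy that only models the `was` check (not Oríon's code)."""

    def __init__(self):
        self.status = {}

    def set_trial_status(self, trial_id, status, was):
        # Update only if the stored status is still the expected one.
        if self.status.get(trial_id) != was:
            raise FailedUpdate(f"{trial_id} is not '{was}' anymore")
        self.status[trial_id] = status


storage = ToyStorage()
storage.status["trial-1"] = "new"

# Two workers saw the trial while it was 'new' and both try to reserve it.
outcomes = []
for worker in ("worker-a", "worker-b"):
    try:
        storage.set_trial_status("trial-1", "reserved", was="new")
        outcomes.append((worker, "reserved the trial"))
    except FailedUpdate:
        outcomes.append((worker, "lost the race"))

print(outcomes)
```

In a real backend the comparison and the update have to happen as one atomic operation; the toy gets away with two steps only because it runs in a single thread.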

Database¶

Warning

The database backends are not meant to be used directly by users. Be careful if you use any method which modifies the data in database or you may break your experiment or trials.

The database backend used to be the sole database support. An additional abstraction layer, the storage protocol, has been added with the goal of supporting various storage types, such as third-party experiment management platforms, which could not be supported using the basic methods read and write. This is why the database backend has been turned into a legacy storage protocol. Because it is the default storage protocol, we document it here for users who may need even more flexibility than what the storage protocol provides.

There are two ways to create the database client. If you already created an experiment client, the database was created in the process and you can get it with orion.storage.legacy.get_database(). Otherwise, you can create the database client with orion.storage.legacy.setup_database() before fetching it with get_database(). To recap, you can create it indirectly with create_experiment() or directly with setup_database(). In both cases, you can access it with get_database().

Here’s an example of how you could remove an experiment:

from orion.client import create_experiment
from orion.storage.legacy import get_database, setup_database

# Create the ExperimentClient and database implicitly
experiment = create_experiment('exp-name', space=dict(x='uniform(0, 1)'))

# Or create database explicitly using setup_database
setup_database(dict(
    type='pickleddb',
    host='db.pkl'
    )
)

# This gets the db singleton that was already instantiated within the experiment object.
db = get_database()

# To remove all trials of an experiment
db.remove('trials', dict(experiment=experiment.id))

# To remove the experiment
db.remove('experiments', dict(_id=experiment.id))

read¶

abstract Database.read(collection_name, query=None, selection=None)[source]

Read a collection and return a value according to the query.

Parameters

collection_name: str: A collection inside database, a table.
query: dict, optional: Filter entries in collection.
selection: dict, optional: Elements of matched entries to return, the projection.

Returns

list: List of matched document[s]

write¶

abstract Database.write(collection_name, data, query=None)[source]

Write new information to a collection. Perform insert or update.

Parameters

collection_name: str: A collection inside database, a table.
data: dict or list of dicts: New data that will be inserted or that will update entries.
query: dict, optional: Assumes an update operation: filter entries in collection to be updated.

Returns

int: Number of new documents if no query, otherwise number of modified documents.

Raises

DuplicateKeyError: If the operation is creating duplicate keys in two different documents. Only occurs if the keys have unique indexes. See Database.ensure_index() for more information about indexes.

Notes

In the case of an insert operation, data variable will be updated to contain a unique _id key.

In the case of an update operation, if query fails to find a document that matches, no operation is performed.

remove¶

abstract Database.remove(collection_name, query)[source]

Delete from a collection document[s] which match the query.

Parameters

collection_name: str: A collection inside database, a table.
query: dict: Filter entries in collection.

Returns

int: Number of documents removed

read_and_write¶

abstract Database.read_and_write(collection_name, query, data, selection=None)[source]

Read a collection’s document and update the found document.

If many documents are found, the first one is selected.

Returns the updated document, or None if nothing found.

Parameters

collection_name: str: A collection inside database, a table.
query: dict: Filter entries in collection.
data: dict or list of dicts: New data that will update the entry.
selection: dict, optional: Elements of matched entries to return, the projection.

Returns

dict or None: Updated first matched document or None if nothing found

Raises

DuplicateKeyError: If the operation is creating duplicate keys in two different documents. Only occurs if the keys have unique indexes. See Database.ensure_index() for more information about indexes.
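The semantics of the four methods can be summarized with a toy in-memory class. This is a sketch under simplifying assumptions, not one of Oríon's backends: queries match by equality on top-level keys only, the `selection` argument and unique indexes (hence DuplicateKeyError) are left out, and `_id` values come from a simple counter. The class name `ToyDatabase` is hypothetical.

```python
import itertools


class ToyDatabase:
    """In-memory stand-in mimicking the documented return values."""

    def __init__(self):
        self._collections = {}
        self._ids = itertools.count(1)

    @staticmethod
    def _matches(doc, query):
        return all(doc.get(key) == value for key, value in (query or {}).items())

    def read(self, collection_name, query=None):
        docs = self._collections.get(collection_name, [])
        return [dict(doc) for doc in docs if self._matches(doc, query)]

    def write(self, collection_name, data, query=None):
        docs = self._collections.setdefault(collection_name, [])
        if query is None:
            # Insert: `data` is updated in place with a unique `_id`.
            items = data if isinstance(data, list) else [data]
            for item in items:
                item.setdefault("_id", next(self._ids))
                docs.append(dict(item))
            return len(items)
        # Update: modify every match, do nothing if there is none.
        matched = [doc for doc in docs if self._matches(doc, query)]
        for doc in matched:
            doc.update(data)
        return len(matched)

    def remove(self, collection_name, query):
        docs = self._collections.get(collection_name, [])
        kept = [doc for doc in docs if not self._matches(doc, query)]
        self._collections[collection_name] = kept
        return len(docs) - len(kept)

    def read_and_write(self, collection_name, query, data):
        # Only the first matching document is updated and returned.
        for doc in self._collections.get(collection_name, []):
            if self._matches(doc, query):
                doc.update(data)
                return dict(doc)
        return None


db = ToyDatabase()
trial = {"experiment": 1, "status": "new"}
print(db.write("trials", trial), trial)
print(db.write("trials", {"status": "broken"}, query={"experiment": 1}))
print(db.read_and_write("trials", {"experiment": 2}, {"status": "new"}))
print(db.remove("trials", {"experiment": 1}))
```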
