Simple example

Installation and setup

In this tutorial you will run a very simple MNIST example in PyTorch using Oríon. First, install Oríon following Installation of Orion’s core and configure the database (Setup Database). Then install PyTorch and torchvision, and clone the PyTorch examples repository:

$ pip3 install torch torchvision
$ git clone git@github.com:pytorch/examples.git

Adapting the code of MNIST example

After cloning the PyTorch examples repository, cd to the mnist folder:

$ cd examples/mnist

In your favourite editor, add the shebang line #!/usr/bin/env python at the top of main.py and make the file executable. For example, with GNU sed:

$ sed -i '1s|^|#!/usr/bin/env python\n|' main.py
$ chmod +x main.py

At the top of the file, below the imports, add one line to import the helper function orion.client.report_results():

from orion.client import report_results

We are almost done now. We need to add a line to the function test() so that it returns the error rate.

return 1 - (correct / len(test_loader.dataset))
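The arithmetic behind that return statement can be checked in isolation. The following is a minimal sketch; the helper name error_rate and the sample counts are hypothetical and are not part of main.py.

```python
def error_rate(correct, total):
    """Return the fraction of misclassified examples, a value in [0, 1]."""
    if total <= 0:
        raise ValueError("total must be a positive number of examples")
    # Same formula as the line added to test(): 1 - accuracy.
    return 1 - (correct / total)


# Hypothetical counts: 9471 correct predictions out of 10000 test images.
rate = error_rate(9471, 10000)
print(f"test error rate: {rate:.4f}")
```

A perfect model gives 0 and a model that gets everything wrong gives 1, so lower is better, which is what a minimizer expects.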

Finally, we retrieve this test error rate and call report_results to return the objective value to Oríon. Note that report_results is meant to be called only once, because Oríon optimizes a single objective value.

    test_error_rate = test(args, model, device, test_loader)

report_results([dict(
    name='test_error_rate',
    type='objective',
    value=test_error_rate)])

You can also return results of type 'gradient' and 'constraint' for algorithms that support them.
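To make the shape of the reported results concrete, here is a small sketch that builds the same list of dictionaries and checks that it carries exactly one objective. The helpers build_results and count_objectives are hypothetical names introduced for illustration; they are not part of Oríon, and the sketch does not call report_results itself.

```python
def build_results(test_error_rate):
    """Build the result list in the same shape as the call shown above."""
    return [dict(
        name='test_error_rate',
        type='objective',
        value=float(test_error_rate))]


def count_objectives(results):
    """Count how many entries are of type 'objective'."""
    return sum(1 for result in results if result['type'] == 'objective')


results = build_results(0.0529)  # illustrative value only

# Exactly one objective should be reported per execution of the script.
if count_objectives(results) != 1:
    raise ValueError("expected exactly one result of type 'objective'")
```

In the adapted main.py, a list like this would be passed once to report_results at the end of the run.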

Important note: we use the test error rate here for the sake of simplicity, because the script as-is does not contain a validation dataset loader. Hyper-parameters should never be optimized on the test set; always use a validation set.
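One possible way to obtain a validation set is to hold out part of the training data. The sketch below only splits example indices using the standard library; the function name split_indices, the 10% fraction and the seed are assumptions made for illustration. In a PyTorch script the resulting index lists could be wrapped with torch.utils.data.Subset to build separate loaders.

```python
import random


def split_indices(n_examples, valid_fraction=0.1, seed=0):
    """Shuffle example indices and split them into (train, validation) lists."""
    if not 0 < valid_fraction < 1:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    indices = list(range(n_examples))
    # A fixed seed keeps the split identical across trials.
    random.Random(seed).shuffle(indices)
    n_valid = int(round(n_examples * valid_fraction))
    return indices[n_valid:], indices[:n_valid]


# MNIST has 60000 training images; hold out 10% of them for validation.
train_idx, valid_idx = split_indices(60000, valid_fraction=0.1)
```

Keeping the split fixed across trials means every hyper-parameter value is evaluated on the same held-out examples.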

Another important note: Oríon always minimizes the objective. Do not report a quantity that should be maximized, such as the accuracy of the model, or the search will converge towards the worst models. Report the error rate (1 - accuracy) instead.
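The effect of minimizing the wrong quantity is easy to see on a toy example. The trial values below are made up for illustration and are not results from the MNIST script.

```python
# Made-up trials: each pairs a learning rate with the accuracy it reached.
trials = [
    {"lr": 1e-4, "accuracy": 0.62},
    {"lr": 1e-2, "accuracy": 0.95},
    {"lr": 1.0, "accuracy": 0.11},
]

# Minimizing the accuracy selects the worst model...
picked_with_accuracy = min(trials, key=lambda trial: trial["accuracy"])

# ...whereas minimizing the error rate (1 - accuracy) selects the best one.
picked_with_error = min(trials, key=lambda trial: 1 - trial["accuracy"])

print(picked_with_accuracy["lr"], picked_with_error["lr"])
```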

Execution

Once the script is adapted, optimizing the hyper-parameters with Oríon is rather simple. Normally you would call the script the following way.

$ ./main.py --lr 0.01

To use it with Oríon, you simply need to prepend the call with orion hunt -n <some name> and specify the hyper-parameter prior distributions.

$ orion hunt -n orion-tutorial ./main.py --lr~'loguniform(1e-5, 1.0)'

This command-line call will sequentially execute ./main.py --lr=<value> with random values sampled from the distribution loguniform(1e-5, 1.0). Oríon supports all distributions from scipy.stats, plus choices() for categorical hyper-parameters (similar to numpy’s choice function).
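For intuition about what a log-uniform prior means, the sketch below draws values whose logarithm is uniformly distributed, so each decade between 1e-5 and 1.0 is equally likely. It uses only the standard library and is not Oríon's implementation; sample_loguniform is a hypothetical helper, and the categorical example uses random.choice as a stand-in for choices().

```python
import math
import random


def sample_loguniform(low, high, rng=random):
    """Draw one value whose logarithm is uniform between log(low) and log(high)."""
    if not 0 < low < high:
        raise ValueError("bounds must satisfy 0 < low < high")
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    # Clamp to guard against floating-point rounding at the bounds.
    return min(max(value, low), high)


rng = random.Random(0)
learning_rates = [sample_loguniform(1e-5, 1.0, rng) for _ in range(1000)]

# A categorical hyper-parameter is simply one of a fixed set of options.
optimizer_name = rng.choice(["sgd", "adam"])
```

Roughly a fifth of the samples fall in each decade (1e-5 to 1e-4, 1e-4 to 1e-3, and so on), which is why this prior suits parameters such as the learning rate whose useful values span several orders of magnitude.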

Experiments are interruptible, meaning that you can stop them either with <ctrl-c> or with kill signals. If your script is not resumable automatically then resuming an experiment will restart your script from scratch.
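A script can be made resumable by saving its progress and reloading it on start-up. The sketch below shows the general idea with a JSON checkpoint; the file name, the train function and the epoch counts are assumptions for illustration. A real training script would also need to save the model and optimizer state.

```python
import json
import os
import tempfile


def train(checkpoint_path, n_epochs):
    """Run up to n_epochs, resuming from checkpoint_path if it exists.

    Returns the number of epochs actually run by this call.
    """
    start_epoch = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start_epoch = json.load(f)["epoch"]

    epochs_run = 0
    for epoch in range(start_epoch, n_epochs):
        # ... one epoch of training would go here ...
        epochs_run += 1
        # Write to a temporary file first so an interruption cannot
        # leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"epoch": epoch + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return epochs_run


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "checkpoint.json")
    ran_first = train(path, n_epochs=2)   # stands in for an interrupted run
    ran_second = train(path, n_epochs=5)  # resumes after the second epoch
```

The second call only runs the remaining three epochs instead of starting over.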

You can resume experiments using the same commandline or simply by specifying the name of the experiment.

$ orion hunt -n orion-tutorial

Note that experiment names are unique: you cannot create two different experiments with the same name.

You can also register experiments without executing them.

$ orion init_only -n orion-tutorial ./main.py --lr~'loguniform(1e-5, 1.0)'

Debugging

When preparing a script for hyper-parameter optimization, we recommend testing it first in debug mode. Debug mode uses an in-memory database which is flushed at the end of execution. Without --debug, you are likely to fill your database quickly with broken experiments.

$ orion --debug hunt -n orion-tutorial ./main.py --lr~'loguniform(1e-5, 1.0)'

Hunting Options

$ orion hunt --help

Oríon arguments (optional):
  These arguments determine orion's behaviour

  -n stringID, --name stringID
                        experiment's unique name; (default: None - specified
                        either here or in a config)
  -u USER, --user USER  user associated to experiment's unique name; (default:
                        $USER - can be overriden either here or in a config)
  -c path-to-config, --config path-to-config
                        user provided orion configuration file
  --max-trials #        number of trials to be completed for the experiment.
                        This value will be saved within the experiment
                        configuration and reused across all workers to
                        determine experiment's completion. (default: inf/until
                        preempted)
  --worker-trials #     number of trials to be completed for this worker. If
                        the experiment is completed, the worker will die even
                        if it did not reach its maximum number of trials
                        (default: inf/until preempted)
  --working-dir WORKING_DIR
                        Set working directory for running experiment.
  --pool-size #         number of simultaneous trials the algorithm should
                        suggest. This is useful if many workers are executed
                        in parallel and the algorithm has a strategy to sample
                        non-independant trials simultaneously. Otherwise, it
                        is better to leave `pool_size` to 1 and set a Strategy
                        for Oríon's producer. (default: 1)

name

The unique name of the experiment.

user

Username used to identify the experiments of a user. The default value is the system’s username $USER.

config

Configuration file for Oríon which may define the database, the algorithm and all options of the command hunt, including name, pool-size and max-trials.

max-trials

The maximum number of trials tried during an experiment.

worker-trials

The maximum number of trials to be executed by a worker (a single call to orion hunt [...]).

working-dir

The directory where configuration files are created. If not specified, Oríon will create a temporary directory that is removed at the end of the trial's execution.

pool-size

The number of trials generated by the algorithm each time it is interrogated. This is useful if many workers are executed in parallel and the algorithm has a strategy to sample non-independent trials simultaneously. Otherwise, it is better to leave pool_size at its default value of 1.

Results

When an experiment reaches its termination criterion, typically max-trials, it prints the following statistics if Oríon is called with -v or -vv.

RESULTS
=======
{'best_evaluation': 0.05289999999999995,
 'best_trials_id': 'b7a741e70b75f074208942c1c2c7cd36',
 'duration': datetime.timedelta(0, 49, 751548),
 'finish_time': datetime.datetime(2018, 8, 30, 1, 8, 2, 562000),
 'start_time': datetime.datetime(2018, 8, 30, 1, 7, 12, 810452),
 'trials_completed': 5}

BEST PARAMETERS
===============
[{'name': '/lr', 'type': 'real', 'value': 0.012027705702344259}]

These results can be printed in the terminal later with the info command, or fetched through the library API.
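As a sketch of how the printed best parameters could be reused, the snippet below turns that list into command-line arguments for main.py. The helper to_cli_args is hypothetical, and the mapping from a name such as '/lr' to the flag --lr is an assumption based on the output shown above.

```python
# The BEST PARAMETERS list, copied from the output above.
best_parameters = [
    {'name': '/lr', 'type': 'real', 'value': 0.012027705702344259},
]


def to_cli_args(parameters):
    """Convert a list of parameter dicts into a flat list of CLI arguments."""
    args = []
    for parameter in parameters:
        # Assumption: the leading '/' in the name corresponds to a '--' flag.
        args.append("--" + parameter['name'].lstrip('/'))
        args.append(repr(parameter['value']))
    return args


command = ["./main.py"] + to_cli_args(best_parameters)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run to retrain the model with the best learning rate found.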