Evolly was designed as a flexible framework that can be integrated into any DL pipeline.
Complete the following steps to boost your model's accuracy:
- Create a config file
- Create a model-building pipeline
- Create a training pipeline
- Run evolution
Let's assume that our goal is to build a classification model for the Fashion MNIST dataset.
We included TensorFlow and PyTorch code examples below to get you familiar with Evolly.
NOTE: Only the TensorFlow blocks of Evolly have been fully tested. The Torch blocks may still have bugs.
It's highly recommended to use yacs config for storing project settings.
The following variables should be used in your training pipeline:
Variable name | Purpose | Required |
---|---|---|
`cfg.genotype.branches` | Backbone architecture (genotype) | + |
`cfg.genotype.branch_names` | Names of the corresponding branches | + |
`cfg.train.epochs` | Number of epochs to train the model | + |
`cfg.train.accelerators` | List of accelerators to use during training | + |
`cfg.train.accelerator_type` | Type of the accelerators: 'GPU' / 'CPU' / 'TPU' | + |
`cfg.train.save_dir` | Path to the directory where weights and metadata will be saved after training | + |
`cfg.model.name` | Name of the weights and metadata files (without format) | + |
`cfg.val.metric_op` | Whether to maximize ('max') or minimize ('min') the target metric | + |
`cfg.model.generation_id` | Generation ID of the model (used for making `cfg.model.name`) | + |
`cfg.model.model_id` | Model identifier (used for making `cfg.model.name`) | + |
`cfg.model.parent_id` | Parent's model ID | - |
`cfg.model.parent` | Path to the parent's weights | - |
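As an illustration, a config holding these variables could look like the sketch below. It uses `types.SimpleNamespace` as a stand-in for a real yacs `CfgNode` so the snippet is self-contained, and every value (block parameters, names, IDs) is hypothetical:

```python
from types import SimpleNamespace

# Hypothetical single-branch genotype with one block:
# [block_id, depth_id, block_type, kernel_size, strides, filters_out, dropout]
cfg = SimpleNamespace(
    genotype=SimpleNamespace(
        branches=[[[1, 1, "conv", 3, 1, 32, 0.0]]],
        branch_names=["img"],
    ),
    train=SimpleNamespace(
        epochs=4,
        accelerators=[0],          # e.g. accelerator indices
        accelerator_type="GPU",
        save_dir="models",
    ),
    model=SimpleNamespace(
        name="gen_0_model_0",      # built from generation_id and model_id
        generation_id=0,
        model_id=0,
        parent_id=None,            # optional
        parent=None,               # optional: path to the parent's weights
    ),
    val=SimpleNamespace(metric_op="max"),
)

print(cfg.train.epochs, cfg.val.metric_op)
```

In a real project the same fields would live in a yacs config file instead of Python code.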
The backbone architecture must be stored in the `cfg.genotype.branches`
variable as a list of branches:
`[branch1, branch2, ..., branchN]`, where each branch is a list of branch blocks.
Each block is a list of its parameters:
`[block_id, depth_id, block_type, kernel_size, strides, filters_out, dropout]`
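For example, a two-branch genotype could be assembled as follows (the parameter values in each block are hypothetical, chosen only to illustrate the structure):

```python
# Each block: [block_id, depth_id, block_type, kernel_size, strides, filters_out, dropout]
branch1 = [
    [1, 1, "conv", 3, 2, 32, 0.0],
    [2, 2, "conv", 3, 2, 64, 0.25],
]
branch2 = [
    [3, 1, "conv", 5, 2, 16, 0.0],
]

branches = [branch1, branch2]  # value for cfg.genotype.branches

# Sanity check: every block carries exactly 7 parameters
assert all(len(block) == 7 for branch in branches for block in branch)
print(len(branches))  # number of branches
```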
You can find an example config file here.
It's also recommended to create a standalone .py file for model building.
Inside that file, unpack the genotype with the `unpack_genotype` function,
then call the `build_model` function to create a model.
Evolly parses the genotype and builds the model object with respect to the specified arguments
of `build_model`, such as:
- Desired framework: TensorFlow or PyTorch
- Custom initial layers
- Activation function that will be used in each block
- Whether to use default or custom head layers:
	- There are 3 default heads: classification, pose estimation, embedding
	- Or you can pass your own custom head for solving another task
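The call shape might look like the sketch below. To keep the snippet self-contained it defines stand-ins for `unpack_genotype` and `build_model` (in real code both come from Evolly, e.g. `from evolly import unpack_genotype, build_model`); the parameter names are illustrative assumptions, not Evolly's exact API:

```python
# Stand-in for Evolly's unpack_genotype: parses blocks out of the genotype.
def unpack_genotype(branches, branch_names):
    return {name: blocks for name, blocks in zip(branch_names, branches)}

# Stand-in for Evolly's build_model: the real function returns a
# tf.keras.Model or torch.nn.Module depending on the chosen framework.
def build_model(blocks, framework="tensorflow", activation="relu",
                initial_layers=None, custom_head=None):
    return {"framework": framework,
            "n_blocks": sum(len(b) for b in blocks.values())}

branches = [[[1, 1, "conv", 3, 2, 32, 0.0]]]  # hypothetical genotype
blocks = unpack_genotype(branches, branch_names=["img"])
model = build_model(blocks, framework="tensorflow", activation="relu")
print(model["n_blocks"])  # 1
```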
A backbone consisting of 3 blocks.
A backbone with two blocks at the third depth level. You can specify multiple blocks at the same depth level; the outputs of these blocks will be concatenated along the filters dimension.
A backbone with two branches. The outputs of multiple branches are connected after the last block of each branch.
The created model is a `tf.keras.Model` or `torch.nn.Module` object of the TensorFlow or PyTorch framework, respectively. You can then use it in your training.
Examples: TensorFlow / PyTorch
With the created model object you can design any training process. You only need to write a wrapper function that will:
- Take the config as input
- Build the model from the genotype
- Train the model
- Compute the fitness value
- Write training logs to a metadata dict
- Save the weights and metadata
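A minimal skeleton of such a wrapper is sketched below. The model building, training, and metric values are stubbed out so the snippet runs standalone; in real code you would build the model from `cfg.genotype.branches`, run your framework's training loop, and save the actual weights:

```python
import json
import os
import time

def train_wrapper(cfg):
    start = time.time()

    # 1. Build the model from the genotype
    #    (stub: real code would use Evolly's build_model here).
    model = object()

    # 2. Train the model (stub: hypothetical per-epoch metrics).
    train_loss = [0.90, 0.62, 0.51, 0.46][: cfg["train"]["epochs"]]
    val_metric = [0.70, 0.80, 0.84, 0.86][: cfg["train"]["epochs"]]

    # 3. Compute the fitness value: here simply the best validation metric.
    if cfg["val"]["metric_op"] == "max":
        fitness = max(val_metric)
    else:
        fitness = min(val_metric)

    # 4. Write training logs to the metadata dict.
    metadata = {
        "train_loss": train_loss,
        "val_metric": val_metric,
        "config": cfg,
        "parameters": 0,          # stub: model parameter count
        "flops": 0,               # stub: model FLOPs
        "training_time": time.time() - start,
        "fitness": fitness,
    }

    # 5. Save metadata (weights saving omitted in this stub).
    os.makedirs(cfg["train"]["save_dir"], exist_ok=True)
    path = os.path.join(cfg["train"]["save_dir"], cfg["model"]["name"] + ".json")
    with open(path, "w") as f:
        json.dump(metadata, f)
    return metadata

cfg = {
    "train": {"epochs": 4, "save_dir": "models"},
    "val": {"metric_op": "max"},
    "model": {"name": "gen_0_model_0"},
}
meta = train_wrapper(cfg)
print(meta["fitness"])  # 0.86
```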
Weights transfer can be used between the parent and child models. We recommend it only if you want to tune a backbone. Our experiments showed that weights transfer with 4 epochs per training run isn't enough to train blocks that were initialized with random weights; 4 epochs is only enough to train the parent's existing blocks with a new configuration (mutated kernel size, strides, filters, or dropout).
Examples: TensorFlow / advanced TensorFlow example / PyTorch
The metadata file must have the .json format and the following structure:
Key | Description | Data type |
---|---|---|
`train_loss` | Train loss of each epoch | List of floats |
`val_metric` | Validation metric of each epoch | List of floats |
`config` | Yacs config dump | Dictionary |
`parameters` | Number of model parameters | Int |
`flops` | Model FLOPs | Int |
`training_time` | Total training time in seconds | Float |
`fitness` | Model fitness value | Float |
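A metadata file matching this structure could be written and read back as follows (all values are hypothetical, and the config dump is truncated):

```python
import json

metadata = {
    "train_loss": [0.91, 0.62, 0.51, 0.46],   # one value per epoch
    "val_metric": [0.70, 0.80, 0.84, 0.86],   # one value per epoch
    "config": {"train": {"epochs": 4}},       # yacs config dump (truncated)
    "parameters": 1_250_000,
    "flops": 320_000_000,
    "training_time": 512.7,
    "fitness": 0.86,
}

with open("gen_0_model_0.json", "w") as f:
    json.dump(metadata, f, indent=2)

# Reload and check that the structure round-trips through JSON
with open("gen_0_model_0.json") as f:
    loaded = json.load(f)
print(loaded["fitness"])  # 0.86
```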
Scheme of the evolution cycle (one generation) with two parents, four children, four GPUs, and parallel training mode enabled:
Running evolution starts with creating a script in which you:
- Import the training wrapper function, the config, and Evolly's `Evolution` class, and specify the accelerators that will be used during evolution
- Define the evolution parameters
- Start the infinite evolution cycle with the `Evolution.start` method
Examples: backbone tuning (TensorFlow) / backbone search (PyTorch)
Then start a TensorBoard process with `tensorboard --logdir PATH_TO_LOGS_DIR --port=6006`
in a separate terminal.
NOTE: If you are launching Evolly on a cloud machine, don't forget to forward the TensorBoard port.
Add `-L 6006:localhost:6006`
to your connection command. For example:
`ssh -p SERVER_PORT USERNAME@HOST_IP -L 6006:localhost:6006`
After that you can access the TensorBoard console in your browser at http://localhost:6006
As soon as the fitness metric stops improving, you can stop the evolution by killing the process, or by pressing Ctrl + C if you are running it in a terminal window.
When evolution is finished, you can use the `analyze_runs` and `visualize_run` functions
to find "the best" model and visualize progress:
`from evolly import analyze_runs, visualize_run`
Examples:
- Finding "the best" model and building plots: creating matplotlib plots and saving them as images.
- Visualizing evolution: making a video of the evolution progression.