Mini project - CS-439 Optimization for Machine Learning

Zero Order Adaptive Momentum Method (ZO-AdaMM)

Introduction

This repository contains the code to reproduce the results of the mini-project for the course CS-439 Optimization for Machine Learning at EPFL.

We study the behavior of the zeroth-order version of the AdaMM algorithm (also known as AMSGrad), called ZO-AdaMM. This method was proposed in ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization, Xiangyi Chen et al.

In particular, we empirically studied this optimizer with simple CNNs ranging from 1'400 to more than 2.5 million parameters on the well-known MNIST classification task.
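The core idea of ZO-AdaMM is to feed an AMSGrad-style update with a gradient estimated purely from function evaluations. The sketch below is a minimal NumPy illustration of that idea, not the repository's implementation; all function and parameter names here are ours, and the repository's `optimizers/zo_adamm.py` may differ in estimator and hyperparameters.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=20, seed=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages (f(x + mu*u) - f(x)) / mu * u over random Gaussian
    directions u; no access to the true gradient is needed.
    """
    rng = np.random.default_rng(seed)
    fx = f(x)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / n_samples

def zo_adamm_step(f, x, m, v, v_hat,
                  lr=0.1, beta1=0.9, beta2=0.99, eps=1e-8, seed=None):
    """One AMSGrad-style update driven by the ZO gradient estimate."""
    g = zo_gradient(f, x, seed=seed)
    m = beta1 * m + (1 - beta1) * g              # first moment
    v = beta2 * v + (1 - beta2) * g ** 2         # second moment
    v_hat = np.maximum(v_hat, v)                 # AMSGrad: non-decreasing second moment
    x = x - lr * m / (np.sqrt(v_hat) + eps)
    return x, m, v, v_hat

# Minimize a simple quadratic treated as a black box.
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(3)
m = v = v_hat = np.zeros(3)
for t in range(500):
    x, m, v, v_hat = zo_adamm_step(f, x, m, v, v_hat, seed=t)
# x should now be close to the minimizer (1, 1, 1).
```

The non-decreasing `v_hat` is what distinguishes AMSGrad (and hence AdaMM) from plain Adam; replacing `zo_gradient` with the true gradient recovers the first-order method.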

Structure of the repository

├── models
│   ├── scalable_model.py    # CNN with a scalable number of parameters
│   └── small_model.py       # Small CNN used for tests
├── optimizers
│   ├── adamm.py             # First-order AdaMM optimizer
│   ├── zo_adamm.py          # Zeroth-order AdaMM optimizer
│   ├── zo_sgd.py            # Zeroth-order SGD optimizer
│   └── scheduler.py         # Learning rate scheduler
├── plots
├── results                  # Results of the experiments notebook
│   └── weights              # Weights recorded in the experiments notebook
├── main.py                  # Main functions to set up model training and run the experiments
├── utils.py                 # Functions to train a model and some utility functions
├── experiments.ipynb        # Notebook running the experiments (model training)
├── analysis.ipynb           # Notebook analysing the experiments, with plots
├── report.pdf               # The project report
├── requirements.txt         # List of all packages needed to run our code
└── README.md                # You are here

Reproducing our results

The libraries required to run our code can be found in requirements.txt.

The results can be reproduced as follows:

  • Run experiments.ipynb to produce the data
  • Run analysis.ipynb to produce the plots used in the report

Remarks:

  • You need to create the folders ./results and ./results/weights if they do not exist on your system.
  • The zeroth-order optimization methods should only be used on the CPU: on the GPU they produced different behaviors on different machines, with some GPUs reaching lower accuracy and higher losses than the corresponding CPU runs, even with the same random seeds. If you still want to use the GPU, comment out line 54 and uncomment line 55 in main.py.

Authors

  • Kieran Vaudaux
  • Elia Fantini
  • Cyrille Pittet
