Mini project - CS-439 Optimization for Machine Learning

Zero Order Adaptive Momentum Method (ZO-AdaMM)

Introduction

This repository contains the code to reproduce the results of the mini-project for the course CS-439 Optimization for Machine Learning at EPFL.

We study the behavior of the zeroth-order version of the AdaMM algorithm (also known as AMSGrad), called ZO-AdaMM. This method was proposed in ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization, Xiangyi Chen et al.

In particular, we empirically studied this optimizer with simple CNNs ranging from 1'400 to more than 2.5 million parameters on the well-known MNIST classification task.
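The core idea of ZO-AdaMM is to feed an AMSGrad-style update with a gradient estimated purely from function evaluations. The sketch below is a minimal NumPy illustration of that idea, not the repository's implementation; all function and parameter names here are ours, and the repository's `optimizers/zo_adamm.py` may differ in estimator and hyperparameters.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=20, seed=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages (f(x + mu*u) - f(x)) / mu * u over random Gaussian
    directions u; no access to the true gradient is needed.
    """
    rng = np.random.default_rng(seed)
    fx = f(x)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / n_samples

def zo_adamm_step(f, x, m, v, v_hat,
                  lr=0.1, beta1=0.9, beta2=0.99, eps=1e-8, seed=None):
    """One AMSGrad-style update driven by the ZO gradient estimate."""
    g = zo_gradient(f, x, seed=seed)
    m = beta1 * m + (1 - beta1) * g              # first moment
    v = beta2 * v + (1 - beta2) * g ** 2         # second moment
    v_hat = np.maximum(v_hat, v)                 # AMSGrad: non-decreasing second moment
    x = x - lr * m / (np.sqrt(v_hat) + eps)
    return x, m, v, v_hat

# Minimize a simple quadratic treated as a black box.
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(3)
m = v = v_hat = np.zeros(3)
for t in range(500):
    x, m, v, v_hat = zo_adamm_step(f, x, m, v, v_hat, seed=t)
# x should now be close to the minimizer (1, 1, 1).
```

The non-decreasing `v_hat` is what distinguishes AMSGrad (and hence AdaMM) from plain Adam; replacing `zo_gradient` with the true gradient recovers the first-order method.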

Structure of the repository

├── models
│   ├── scalable_model.py    # CNN with a scalable number of parameters
│   └── small_model.py       # Small CNN used for tests
├── optimizers
│   ├── adamm.py             # First-order AdaMM optimizer
│   ├── zo_adamm.py          # Zeroth-order AdaMM optimizer
│   ├── zo_sgd.py            # Zeroth-order SGD optimizer
│   └── scheduler.py         # Learning rate scheduler
├── plots
├── results                  # Results of the experiments notebook
│   └── weights              # Weights recorded in the experiments notebook
├── main.py                  # Main functions to set up model training and run the experiments
├── utils.py                 # Functions to train a model and some utility functions
├── experiments.ipynb        # Notebook running the experiments (model training)
├── analysis.ipynb           # Notebook analysing the experiments, with plots
├── report.pdf               # The project report
├── requirements.txt         # List of all packages needed to run our code
└── README.md                # You are here

Reproducing our results

The libraries required to run our code can be found in requirements.txt.

The results can be reproduced as follows:

  • Run experiments.ipynb to produce the data
  • Run analysis.ipynb to produce the plots used in the report

Remarks:

  • You need to create the folders ./results and ./results/weights if they do not exist on your system.
  • The zeroth-order optimization methods should only be used on the CPU: on the GPU they produced different behaviors on different machines, with some GPUs reaching lower accuracy and higher losses than the corresponding CPU runs, even with the same random seeds. If you still want to use the GPU, comment out line 54 and uncomment line 55 in main.py.

Authors

  • Kieran Vaudaux
  • Elia Fantini
  • Cyrille Pittet
