
NAS-Bench-360

This codebase reproduces the empirical evaluations reported in the associated paper for NAS-Bench-360, a benchmark for evaluating neural architecture search on diverse tasks.

Resources

Oct 2022: Please use this link for all dataset and precompute downloads as we are transitioning our AWS resources to the Open Data Sponsorship program:

Shared Drive

Datasets in the benchmark with download links (old links; use the shared drive above!):

Precomputed evaluation benchmark files on the NB201 search space (following NATS-Bench):

For the full outputs (including training logs and all weights and checkpoints), which total roughly 40 GB, please contact the administrators.
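The precompute files store per-architecture training and evaluation statistics in a NATS-Bench-style format. The sketch below simply inspects one downloaded file; the file name and the exact record layout are assumptions, so check the precompute directory and the paper for the actual format.

```python
# Minimal sketch: inspect a downloaded precompute file. The file name and the
# record layout are assumptions -- see the precompute/ directory and the paper
# for the actual format used by NAS-Bench-360.
import pickle

with open("nb201_ninapro_precompute.pkl", "rb") as f:  # hypothetical file name
    records = pickle.load(f)

print(type(records), len(records))
# Peek at one entry to see which per-architecture metrics are stored.
sample = next(iter(records.items())) if isinstance(records, dict) else records[0]
print(sample)
```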

Prerequisites for main NAS experiments

We use the open-source Determined platform to implement the experiment code.

Install Determined: pip install determined

A master instance is required:

  • for local deployment (requires Docker):

    • to start the master: det deploy local cluster-up
    • access the WebUI at http://localhost:8080
    • to shut down: det deploy local cluster-down
  • for AWS deployment (preferred):

    • install the AWS CLI
    • run aws configure and note your AWS EC2 key pair name
    • to start the master: det deploy aws up --cluster-id CLUSTER_ID --keypair KEYPAIR_NAME
    • access the WebUI at {ec2-instance-uri}:8080
    • to shut down: det deploy aws down --cluster-id CLUSTER_ID

For an end-to-end example of running experiments with Determined, refer to this video.

When running experiments, a Docker image containing all required Python packages is automatically pulled from Docker Hub, so you don't need to install them yourself, and it ensures reproducibility.
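Once a master is running, individual experiments are submitted to it, e.g. with the CLI command det experiment create <config.yaml> <model_dir>. The snippet below is a minimal sketch of the equivalent submission through Determined's Python SDK; the master address and the config path are illustrative placeholders, not files guaranteed to exist in this repository.

```python
# Minimal sketch: submitting an experiment via Determined's Python SDK.
# The master address and config path below are placeholder assumptions --
# point them at your running master and at one of the experiment .yaml
# configs shipped in darts/, densenas/, etc.
from determined.experimental import client

client.login(master="http://localhost:8080")  # or your EC2 master URI

experiment = client.create_experiment(
    config="darts/cifar100.yaml",  # hypothetical config path
    model_dir="darts",             # directory containing the model code
)
print("submitted experiment", experiment.id)
```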

Main NAS Experiments Reproduction

We provide PyTorch implementations of two state-of-the-art NAS algorithms: GAEA PC-DARTS (paper link) and DenseNAS (paper link). Each lives in the folder with the corresponding name: darts/ for GAEA PC-DARTS and densenas/ for DenseNAS.

To run these algorithms on 1D tasks, we adapted their search spaces; the corresponding experiments are in darts_1d/ for GAEA PC-DARTS (1D) and densenas_1d/ for DenseNAS (1D).

Two task-specific NAS methods are implemented: Auto-DeepLab for dense prediction tasks in autodeeplab/ and AMBER for 1D prediction tasks in AMBER/.

We also implement procedures for running and tuning hyperparameters of the Wide ResNet backbone architecture (paper link) in backbone/. The 1D-customized Wide ResNet is in backbone_1d/.

To change the random seed for an experiment, edit the value under reproducibility: experiment_seed: in that experiment's configuration script.
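The seed can also be overridden programmatically before submission; in the sketch below the config file name is a hypothetical placeholder.

```python
# Minimal sketch: override reproducibility.experiment_seed in an experiment
# config before submitting it. "darts/cifar100.yaml" is a hypothetical path --
# substitute the configuration script of the experiment you want to run.
import yaml  # PyYAML

with open("darts/cifar100.yaml") as f:
    cfg = yaml.safe_load(f)

cfg.setdefault("reproducibility", {})["experiment_seed"] = 1  # new seed value

with open("darts/cifar100_seed1.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```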

Additional Baseline Experiments

We also evaluate the performance of non-NAS baselines for comparison:

  • Expert architectures for each dataset: see expert.
  • Perceiver-IO: see perceiver-io.
  • XGBoost: see xgboost (a minimal sketch follows this list).
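As a rough illustration of the XGBoost baseline style, the sketch below trains a gradient-boosted tree classifier on flattened inputs. It is not the exact script in xgboost/, and the synthetic data stands in for a real NAS-Bench-360 task.

```python
# Illustrative sketch of an XGBoost point-prediction baseline (not the exact
# script in xgboost/). Synthetic data stands in for a real task; inputs are
# flattened to feature vectors before fitting the gradient-boosted trees.
import numpy as np
from xgboost import XGBClassifier

x_train = np.random.rand(512, 1, 28, 28)
y_train = np.random.randint(0, 10, size=512)
x_test = np.random.rand(128, 1, 28, 28)
y_test = np.random.randint(0, 10, size=128)

model = XGBClassifier(n_estimators=200, max_depth=6, tree_method="hist")
model.fit(x_train.reshape(len(x_train), -1), y_train)
print("test accuracy:", model.score(x_test.reshape(len(x_test), -1), y_test))
```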

Precomputed results on NinaPro and DarcyFlow

  • See the precompute directory for the NAS algorithms from NATS-Bench and for reproducing the precomputed benchmark.

Baselines

Performance of NAS methods and baselines across NAS-Bench-360. Methods are divided into efficient methods (e.g., DenseNAS and fixed WRN) that take 1-10 GPU-hours, more expensive methods (e.g., DARTS and tuned WRN) that take 10-100+ GPU-hours, and specialized methods (Auto-DL and AMBER). All results are averages over three random seeds, and lower is better for all metrics.

Citation

If you find this project helpful, please consider citing our paper:

@inproceedings{
  tu2022nasbench,
  title={{NAS}-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks},
  author={Renbo Tu and Nicholas Roberts and Mikhail Khodak and Junhong Shen and Frederic Sala and Ameet Talwalkar},
  booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2022},
  url={https://openreview.net/forum?id=xUXTbq6gWsB}
}

Thanks!