Minimum Description Length Recurrent Neural Networks

Code for Minimum Description Length Recurrent Neural Networks by Nur Lan, Michal Geyer, Emmanuel Chemla, and Roni Katzir.

Paper: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00489/

Getting started

  1. Install Python >= 3.7
  2. pip install -r requirements.txt

On Ubuntu, install:

$ apt-get install libsm6 libxext6 libxrender1 libffi-dev libopenmpi-dev

Running simulations

$ python main.py --simulation <simulation_name> -n <number_of_islands>

For example, to run the aⁿbⁿcⁿ task using 16 island processes:

$ python main.py --simulation an_bn_cn -n 16

  • All simulations are available in simulations.py

  • Final and intermediate solutions are saved to the networks sub-directory, both as pickle files and as visualizations in Graphviz dot format.
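
The command and output location described above can also be scripted from Python. The following is a minimal sketch, not part of this repository: the helper names build_command, run_simulation, and latest_pickle are hypothetical, and the sketch assumes that saved networks carry a .pickle suffix (as in networks/net.pickle) and that the script is run from the repository root.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def build_command(simulation: str, islands: int) -> List[str]:
    """Build the documented command: main.py --simulation <name> -n <islands>."""
    if islands < 1:
        raise ValueError("islands must be a positive integer")
    # sys.executable reuses the interpreter that is running this script.
    return [sys.executable, "main.py", "--simulation", simulation, "-n", str(islands)]


def run_simulation(simulation: str, islands: int) -> None:
    """Run one simulation and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(simulation, islands), check=True)


def latest_pickle(directory: str = "networks") -> Optional[Path]:
    """Return the most recently modified *.pickle file under directory, or None."""
    root = Path(directory)
    if not root.is_dir():
        return None
    # Assumption: saved networks end in .pickle; rglob also searches sub-folders,
    # since the exact layout under networks/ is not specified here.
    candidates = sorted(root.rglob("*.pickle"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None
```

Calling run_simulation("an_bn_cn", 16) followed by latest_pickle() would run the example task and then return the path of the newest saved network, or None if nothing has been written yet.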

PyTorch conversion

Converting a network trained using the genetic algorithm to a PyTorch module:

import pickle

import torch_conversion

# Load a network saved by the genetic algorithm.
with open("networks/net.pickle", "rb") as f:
    net = pickle.load(f)

# Convert it to a PyTorch module.
torch_net = torch_conversion.mdlnn_to_torch(net)

Then fine-tune and evaluate using MDLRNN-torch.

Parallelization

Native Python multiprocessing is used by default. To use MPI, change migration_channel to mpi in simulations.py.

Citing this work

@article{Lan-Geyer-Chemla-Katzir-MDLRNN-2022,
  title = {Minimum Description Length Recurrent Neural Networks},
  author = {Lan, Nur and Geyer, Michal and Chemla, Emmanuel and Katzir, Roni},
  year = {2022},
  month = jul,
  journal = {Transactions of the Association for Computational Linguistics},
  volume = {10},
  pages = {785--799},
  issn = {2307-387X},
  doi = {10.1162/tacl_a_00489},
}