
Electra With Memory-Efficient Compositional Embeddings

Introduction

ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens vs "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.

For a detailed description and experimental results, please refer to the ICLR 2020 paper ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators.

Compositional embeddings using complementary partitions is a relatively novel approach for reducing embedding size in an end-to-end fashion: it exploits complementary partitions of the category set to produce a unique embedding vector for each category without explicitly storing one embedding per category. It is an effective, memory-efficient technique for models with a massive vocabulary or high-cardinality features, where the embedding tables can become a bottleneck during training. The authors show that the information loss from the generated compositional embeddings is minimal compared to full embeddings, and that the quotient-remainder trick used in the paper is more effective than the earlier hashing trick.

For a detailed description and experimental results, please refer to the paper Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems.
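
To illustrate the idea, here is a minimal sketch of the quotient-remainder trick in TensorFlow. It is not the implementation used in this repository; the layer name QREmbedding and the num_buckets argument are placeholders for the example. Each token ID indexes two much smaller tables, one by the quotient and one by the remainder of the ID divided by the number of buckets, and the two vectors are combined element-wise to yield a unique embedding per token.

```python
import tensorflow as tf

class QREmbedding(tf.keras.layers.Layer):
    """Quotient-remainder compositional embedding (illustrative sketch)."""

    def __init__(self, vocab_size, embed_dim, num_buckets, **kwargs):
        super().__init__(**kwargs)
        self.num_buckets = num_buckets
        # Quotient table: ceil(vocab_size / num_buckets) rows.
        self.q_table = tf.keras.layers.Embedding(
            (vocab_size + num_buckets - 1) // num_buckets, embed_dim)
        # Remainder table: num_buckets rows.
        self.r_table = tf.keras.layers.Embedding(num_buckets, embed_dim)

    def call(self, token_ids):
        # The two lookups form complementary partitions of the vocabulary,
        # so every token ID maps to a unique (quotient, remainder) pair.
        q = self.q_table(token_ids // self.num_buckets)
        r = self.r_table(token_ids % self.num_buckets)
        return q * r  # element-wise product combines the two partitions


# Two tables of roughly sqrt(V) rows replace one table with V rows.
layer = QREmbedding(vocab_size=100_000, embed_dim=128, num_buckets=317)
vectors = layer(tf.constant([[1, 42, 99_999]]))  # shape (1, 3, 128)
```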

Motivation

This repository contains code to pre-train ELECTRA, with an option to use memory-efficient compositional embeddings for datasets with very large vocabularies. It currently supports text data from CSV files on a single GPU only. An example of fine-tuning ELECTRA on a sentiment classification task is provided. This code is ideal for researchers working with non-English datasets, recommendation engine data, MIDI files, etc.

Pretraining

Use pretraining.py to pre-train an ELECTRA model. It has the following arguments:

  • --raw_data_loc (optional): path to the CSV file containing the text sentences.
  • --col_name (optional): name of the text column in the dataset to use for pretraining.
  • --working_dir (optional): directory in which to store model weights, configs, and vocabulary tokens.
  • --hparams (optional): a dict of model hyperparameters. See Pretraining_Config.py under the Configs folder for the supported defaults. To override any of the defaults, pass them as a dictionary, for example: --hparams {"hparam1": value1, "hparam2": value2, ...}.
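
For example, a typical run might be launched as follows (the CSV path, column name, and working directory are illustrative, and any overridden hyperparameter names must match those in Pretraining_Config.py):

```
python pretraining.py --raw_data_loc data/train.csv --col_name text --working_dir ./electra_output --hparams {"hparam1": value1}
```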

For example notebooks on pretraining ELECTRA, with and without compositional embeddings, see Pretraining.ipynb and Pretraining_Compositional_Embeddings.ipynb.

Fine Tuning

Use FineTuning.py to fine-tune the pre-trained ELECTRA model for a sentiment classification task. It has the following arguments:

  • --raw_data_loc (optional): path to the CSV file containing the text sentences.
  • --working_dir (optional): directory from which to load the pretrained model weights, configs, and vocabulary tokens.
  • --hparams (optional): a dict of model hyperparameters. See Finetuning_Config.py under the Configs folder for the supported defaults. To override any of the defaults, pass them as a dictionary, for example: --hparams {"hparam1": value1, "hparam2": value2, ...}.
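
Similarly, a fine-tuning run might look like the following (paths are illustrative; --working_dir should point to the directory used during pretraining):

```
python FineTuning.py --raw_data_loc data/sentiment.csv --working_dir ./electra_output --hparams {"hparam1": value1}
```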

For example notebooks on fine-tuning ELECTRA, with and without compositional embeddings, see FineTuning.ipynb and FineTuning_Compositional_Embeddings.ipynb.

Setup

Fork this repository and run the pretraining example (instructions above) on the data provided to familiarize yourself with the repository and its parameters. To use your own data, create a CSV file with a column of text, as in the example below. Feel free to change the code as needed to support your own requirements.
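
For reference, the input is just a CSV with a text column; a hypothetical file might look like this (the column name text is illustrative and is passed via --col_name):

```
text
"This movie was fantastic, I loved every minute of it."
"The plot was predictable and the acting was flat."
```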

Contact Info

For issues related to the repository, please raise a GitHub issue or contact me at keshavbhandari@gmail.com.

Please star the repository if you find it useful! Thanks :)

About

A custom TensorFlow implementation of Google's ELECTRA NLP model with compositional embeddings using complementary partitions
