Text Classification Benchmark

A Benchmark of Text Classification in PyTorch

Motivation

We are building a benchmark for text classification that includes:

Many text classification datasets, covering sentiment and topic classification in popular languages (e.g. English and Chinese). Basic word embeddings are also provided.

Implementations of many popular and state-of-the-art models, especially deep neural networks.

Done so far

Some datasets and models have already been implemented.

Datasets done

IMDB
SST
TREC

Models done

FastText
BasicCNN (KimCNN, MultiLayerCNN, Multi-perspective CNN)
InceptionCNN
LSTM (BiLSTM, StackLSTM)
LSTM with Attention (Self Attention / Quantum Attention)
Hybrids between CNN and RNN (RCNN, C-LSTM)
Transformer ("Attention Is All You Need")
ConvS2S
Capsule
Quantum-inspired NN
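
As an illustration of the simplest entry in this list, a FastText-style classifier averages the word embeddings of a sentence and feeds the result to a linear layer. The following is a minimal sketch under that assumption, not the code in models/; the class name `FastTextSketch` and its constructor parameters are hypothetical.

```python
import torch
import torch.nn as nn


class FastTextSketch(nn.Module):
    """Hypothetical FastText-style classifier: mean of word embeddings + linear layer."""

    def __init__(self, vocab_size, embed_dim, num_classes, pad_idx=0):
        super().__init__()
        self.pad_idx = pad_idx
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) tensor of word indices, padded with pad_idx
        mask = (token_ids != self.pad_idx).unsqueeze(-1).float()
        embedded = self.embedding(token_ids) * mask
        # Average over non-padding positions; clamp avoids division by zero
        lengths = mask.sum(dim=1).clamp(min=1.0)
        pooled = embedded.sum(dim=1) / lengths
        return self.fc(pooled)  # (batch, num_classes) logits
```

Because padding positions are masked out before averaging, the logits for a sentence should not depend on how much padding the batch adds.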

Library

You should have these libraries installed:

python3
torch
torchtext (optional)

Dataset

Datasets are set up automatically in the current path. Alternatively, download your data manually into Dataset, step by step.

This includes:

GloVe embeddings
Sentiment classification dataset IMDB
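
For readers who download the embeddings manually, GloVe vectors are distributed as plain text with one word per line followed by its vector components. The sketch below parses that format into a dictionary; the function name `load_glove_text` is hypothetical and is not the loader used in this repository.

```python
import io


def load_glove_text(lines):
    """Parse GloVe-style text lines ("word v1 v2 ... vn") into a dict of lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory stand-in for a real GloVe text file
sample = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
embeddings = load_glove_text(sample)
print(len(embeddings), len(embeddings["good"]))
```

With a real file, pass an open file handle instead of the `io.StringIO` sample.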

Usage

Run with the default settings

python main.py

CNN

python main.py --model cnn

LSTM

python main.py --model lstm
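
A flag such as `--model` is typically read with `argparse`. The sketch below shows one way this could work; it is an assumption for illustration, not the contents of opts.py, and the function name `parse_opts` and the default value are hypothetical.

```python
import argparse


def parse_opts(argv=None):
    # Hypothetical option parser; the real options are defined in opts.py.
    parser = argparse.ArgumentParser(description="Text classification benchmark (sketch)")
    # The default below is arbitrary; the repository's actual default may differ.
    parser.add_argument("--model", default="lstm", help="model name, e.g. cnn or lstm")
    return parser.parse_args(argv)


opt = parse_opts(["--model", "cnn"])
print(opt.model)
```

Passing `argv=None` makes `argparse` fall back to the real command line, so the same function serves both scripts and tests.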

Road Map

Organisation of the repository

The core of this repository is the models and the datasets.

dataloader/: loads all datasets, such as IMDB and SST
models/: builds all models, such as FastText, LSTM, CNN, Capsule, QuantumCNN, and Multi-Head Attention
opts.py: parameters and configuration options
utils.py: utility functions
dataHelper.py: data helper functions

Contributing

Issues and contributions are welcome!


FreedomIntelligence/TextClassificationBenchmark
