Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications

ACL-19: https://www.aclweb.org/anthology/P19-1150/

Requirements: the code is written in Python 3 and requires PyTorch.

Preparation

For a quick start, please download the dataset and the trained model.

Code Explanation

data_helpers.py implements the data-processing functions.

layer.py implements the main components of the capsule network, including KDE routing, Adaptive KDE routing, and the Primary Capsule layer.

network.py provides wrappers for our model and for the baseline models used for comparison.

utils.py provides the evaluation functions, such as Precision@1,3,5 and NDCG@1,3,5.

EUR_Cap.py and EUR_eval.py are used for training and inference, respectively.
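
For readers unfamiliar with the ranking metrics reported by utils.py, the sketch below shows one common way to compute Precision@k and NDCG@k for a single document with binary relevance labels. It is an illustration only, not the code in utils.py: the function names `precision_at_k` and `ndcg_at_k` and the toy scores are assumptions made for this example.

```python
import math


def top_k_indices(scores, k):
    """Indices of the k highest-scored labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored labels that are relevant (labels are 0/1)."""
    return sum(labels[i] for i in top_k_indices(scores, k)) / k


def ndcg_at_k(scores, labels, k):
    """DCG of the predicted top-k divided by the DCG of an ideal top-k ranking."""
    dcg = sum(labels[i] / math.log2(rank + 2)
              for rank, i in enumerate(top_k_indices(scores, k)))
    n_relevant = min(k, sum(labels))
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(n_relevant))
    return dcg / ideal if ideal > 0 else 0.0


# Hypothetical example: 5 candidate labels, 3 of them relevant.
scores = [0.9, 0.1, 0.8, 0.3, 0.7]
labels = [1, 0, 0, 1, 1]
for k in (1, 3, 5):
    print(k, precision_at_k(scores, labels, k), ndcg_at_k(scores, labels, k))
```

Under this definition, NDCG@1 equals Precision@1 for any document with at least one relevant label, which is consistent with the identical first entries of the Prec and NDCG lists in the logs further down.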

Quick start

CUDA_VISIBLE_DEVICES=0 python EUR_eval.py # run inference with the trained model on a single GPU

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python EUR_Cap.py # train CapNet on 8 GPUs

CUDA_VISIBLE_DEVICES=0 python EUR_Cap_grad.py # train CapNet on single GPU with accumulated gradients
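
The idea behind training with accumulated gradients can be sketched without any framework. The example below is not taken from EUR_Cap_grad.py; the 1-D model, the data, and the function names are assumptions chosen to keep it self-contained. It shows that summing micro-batch gradients, each scaled by the number of micro-batches, reproduces the gradient of one large batch, so a single GPU can emulate a larger batch size at the cost of more forward and backward passes.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Accumulate micro-batch gradients, each divided by the number of micro-batches.

    Matches the full-batch gradient when all micro-batches have the same size.
    """
    chunks = [data[i:i + micro_batch_size]
              for i in range(0, len(data), micro_batch_size)]
    total = 0.0
    for chunk in chunks:
        total += grad_mse(w, chunk) / len(chunks)
    return total


# Hypothetical toy data: (x, y) pairs roughly following y = 2x.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

full = grad_mse(w, data)                               # one batch of 4
acc = accumulated_grad(w, data, micro_batch_size=2)    # two micro-batches of 2
print(full, acc)

# A single parameter update after accumulation, with an assumed learning rate.
w_new = w - 0.01 * acc
```

In PyTorch the same pattern usually amounts to calling `backward()` on each micro-batch loss divided by the number of accumulation steps, then calling `optimizer.step()` and `optimizer.zero_grad()` once per group of micro-batches.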

Performance on EUR-Lex dataset

NLP-Capsule with Adaptive KDE routing:

Epoch: 20 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.33459
Tst Prec@1,3,5:  [0.7948253557567917, 0.65605864596808838, 0.53666235446312649]  
Tst NDCG@1,3,5:  [0.7948253557567917, 0.70826730037244034, 0.6843311797551882]

Epoch: 21 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.24704
Tst Prec@1,3,5:  [0.79301423027166884, 0.6552824493316064, 0.53666235446312793]  
Tst NDCG@1,3,5:  [0.79301423027166884, 0.70672871614554134, 0.68443643153244704]

Epoch: 22 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.24949
Tst Prec@1,3,5:  [0.79404915912031049, 0.65554118154376773, 0.53800776196636135] 
Tst NDCG@1,3,5:  [0.79404915912031049, 0.70816714976829975, 0.68780244631961929]

Epoch: 23 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.25533
Tst Prec@1,3,5:  [0.8046571798188874, 0.65890470030185422, 0.53604139715394228]  
Tst NDCG@1,3,5:  [0.8046571798188874, 0.71380071010660562, 0.69040247647419262]

Epoch: 24 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.26880
Tst Prec@1,3,5:  [0.80620957309184993, 0.65614489003880982, 0.53661060802069527]  
Tst NDCG@1,3,5:  [0.80620957309184993, 0.7133596479633022, 0.69571103238443532]

Epoch: 25 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.25847
Tst Prec@1,3,5:  [0.80155239327296246, 0.65329883570504454, 0.53448900388098108]  
Tst NDCG@1,3,5:  [0.80155239327296246, 0.7096033706441367, 0.69201706652281636]

Epoch: 26 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.26063
Tst Prec@1,3,5:  [0.80000000000000004, 0.65381630012936431, 0.53350582147477121]  
Tst NDCG@1,3,5:  [0.80000000000000004, 0.71043623399753963, 0.69499344732549306]

Epoch: 27 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.26004
Tst Prec@1,3,5:  [0.79689521345407499, 0.65398878827080587, 0.53376455368693132]  
Tst NDCG@1,3,5:  [0.79689521345407499, 0.71269493382033577, 0.69812854866301688]

Epoch: 28 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.27287
Tst Prec@1,3,5:  [0.79818887451487708, 0.65588615782664883, 0.53500646830530163]  
Tst NDCG@1,3,5:  [0.79818887451487708, 0.71429911265714374, 0.70057615675866636]


XML-CNN:

Epoch: 31 Iteration: 45/46 (97.8%)  Loss: 0.00006 0.15460
Tst Prec@1,3,5:  [0.7583441138421734, 0.6164726175075479, 0.5073738680465716]  
Tst NDCG@1,3,5:  [0.7583441138421734, 0.6661232856458101, 0.644838787586548]

Epoch: 32 Iteration: 45/46 (97.8%)  Loss: 0.00005 0.15354
Tst Prec@1,3,5:  [0.759379042690815, 0.6143165157395448, 0.5062871927554978]  
Tst NDCG@1,3,5:  [0.759379042690815, 0.6648180435110952, 0.6434396675410785]

Epoch: 33 Iteration: 45/46 (97.8%)  Loss: 0.00005 0.15399
Tst Prec@1,3,5:  [0.757567917205692, 0.6169038378611481, 0.507373868046571]  
Tst NDCG@1,3,5:  [0.757567917205692, 0.666160785036582, 0.6440332351720106]

Epoch: 34 Iteration: 45/46 (97.8%)  Loss: 0.00004 0.15153
Tst Prec@1,3,5:  [0.7573091849935317, 0.616645105648988, 0.5099094437257432]  
Tst NDCG@1,3,5:  [0.7573091849935317, 0.6659194956789641, 0.6458294426678642]

Epoch: 35 Iteration: 45/46 (97.8%)  Loss: 0.00005 0.15212
Tst Prec@1,3,5:  [0.7552393272962484, 0.6153514445881856, 0.5092367399741262]  
Tst NDCG@1,3,5:  [0.7552393272962484, 0.6648419426927356, 0.6453632713906606]

Epoch: 36 Iteration: 45/46 (97.8%)  Loss: 0.00004 0.15231
Tst Prec@1,3,5:  [0.7596377749029755, 0.6157826649417857, 0.5093402328589907]  
Tst NDCG@1,3,5:  [0.7596377749029755, 0.6661452963066051, 0.646133349811576]

Epoch: 37 Iteration: 45/46 (97.8%)  Loss: 0.00006 0.15357
Tst Prec@1,3,5:  [0.7570504527813713, 0.6175937904269097, 0.5088227684346699]  
Tst NDCG@1,3,5:  [0.7570504527813713, 0.6670823259018512, 0.6455866525334287]

Epoch: 38 Iteration: 45/46 (97.8%)  Loss: 0.00006 0.16400
Tst Prec@1,3,5:  [0.7583441138421734, 0.6162138852953867, 0.5085122897800777]  
Tst NDCG@1,3,5:  [0.7583441138421734, 0.6658377730303046, 0.6448260229129755]

Epoch: 39 Iteration: 45/46 (97.8%)  Loss: 0.00004 0.15555
Tst Prec@1,3,5:  [0.7578266494178525, 0.6173350582147488, 0.509029754204398]  
Tst NDCG@1,3,5:  [0.7578266494178525, 0.6667396690496684, 0.645590263852396]

Epoch: 40 Iteration: 45/46 (97.8%)  Loss: 0.00004 0.15414
Tst Prec@1,3,5:  [0.7565329883570504, 0.61811125485123, 0.5087192755498058]  
Tst NDCG@1,3,5:  [0.7565329883570504, 0.6674559324640292, 0.6452839523583206]
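
To compare runs, logs in the format shown above can be parsed with a few lines of Python. The sketch below is a convenience helper written for this README, not part of the repository; the function name `parse_log` and the choice of Precision@1 as the selection criterion are assumptions. The sample text is copied from two epochs of the NLP-Capsule log above.

```python
import ast
import re

EPOCH_RE = re.compile(r"Epoch:\s*(\d+)")
METRIC_RE = re.compile(r"Tst (Prec|NDCG)@1,3,5:\s*(\[.*?\])")


def parse_log(text):
    """Return {epoch: {"Prec": [p1, p3, p5], "NDCG": [n1, n3, n5]}}."""
    results, epoch = {}, None
    for line in text.splitlines():
        match = EPOCH_RE.search(line)
        if match:
            epoch = int(match.group(1))
            results[epoch] = {}
            continue
        match = METRIC_RE.search(line)
        if match and epoch is not None:
            # The bracketed list is a valid Python literal, so parse it safely.
            results[epoch][match.group(1)] = ast.literal_eval(match.group(2))
    return results


sample = """Epoch: 23 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.25533
Tst Prec@1,3,5:  [0.8046571798188874, 0.65890470030185422, 0.53604139715394228]
Tst NDCG@1,3,5:  [0.8046571798188874, 0.71380071010660562, 0.69040247647419262]

Epoch: 24 Iteration: 120/121 (99.2%)  Loss: 0.00000 0.26880
Tst Prec@1,3,5:  [0.80620957309184993, 0.65614489003880982, 0.53661060802069527]
Tst NDCG@1,3,5:  [0.80620957309184993, 0.7133596479633022, 0.69571103238443532]
"""

log = parse_log(sample)
best_epoch = max(log, key=lambda e: log[e]["Prec"][0])
print(best_epoch, round(log[best_epoch]["Prec"][0], 4))  # prints: 24 0.8062
```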

Reference

If you find our source code useful, please consider citing our work.

@inproceedings{zhao2019capsule,
    title = "Towards Scalable and Reliable Capsule Networks for Challenging {NLP} Applications",
    author = "Zhao, Wei and Peng, Haiyun and Eger, Steffen and Cambria, Erik and Yang, Min",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P19-1150",
    doi = "10.18653/v1/P19-1150",
    pages = "1549--1559"
}
