yoseflaw/nerindo

Nerindo

Named Entity Recognition (NER) for Bahasa Indonesia with PyTorch.

Corpus for NER:

The step-by-step implementation in Google Colab is indexed here.

The fine-tuned Indonesian word embeddings (id_ft.bin) are available here. They are based on the word embeddings trained in indonesian-word-embedding.

Included configurations

  1. BiLSTM
  2. BiLSTM + Word Embeddings
  3. BiLSTM + Word Embeddings + Char Embeddings (CNN)
  4. BiLSTM + Word Embeddings + Char Embeddings (CNN) + Attention Layer
  5. Transformer (simplified BERT) + Word Embeddings + Char Embeddings (CNN)
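As a rough illustration of configuration 3 above, a BiLSTM tagger that combines word embeddings with CNN-based character embeddings might look like the sketch below. The class names (`CharCNN`, `BiLSTMTagger`), all layer sizes, and the dummy batch are assumptions made for this example; they are not taken from the repository's code.

```python
import torch
import torch.nn as nn


class CharCNN(nn.Module):
    """Hypothetical character encoder: embed characters, convolve, max-pool per word."""

    def __init__(self, n_chars, char_dim=25, n_filters=30, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # padding keeps the word length unchanged for odd kernel sizes
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size, padding=kernel_size // 2)

    def forward(self, chars):
        # chars: (batch, seq_len, word_len) tensor of character ids
        b, s, w = chars.shape
        x = self.embed(chars.reshape(b * s, w))   # (b*s, word_len, char_dim)
        x = self.conv(x.transpose(1, 2))          # (b*s, n_filters, word_len)
        x = x.max(dim=2).values                   # max over character positions
        return x.reshape(b, s, -1)                # (batch, seq_len, n_filters)


class BiLSTMTagger(nn.Module):
    """Hypothetical tagger: word embeddings + char CNN features -> BiLSTM -> tag scores."""

    def __init__(self, n_words, n_chars, n_tags, word_dim=100, char_filters=30, hidden_dim=64):
        super().__init__()
        self.word_embed = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.char_cnn = CharCNN(n_chars, n_filters=char_filters)
        self.lstm = nn.LSTM(word_dim + char_filters, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, words, chars):
        # concatenate word-level and character-level features for each token
        x = torch.cat([self.word_embed(words), self.char_cnn(chars)], dim=2)
        h, _ = self.lstm(x)
        return self.out(h)                        # (batch, seq_len, n_tags)


# Dummy batch: 2 sentences, 7 tokens each, 4 characters per token (assumed sizes).
model = BiLSTMTagger(n_words=50, n_chars=20, n_tags=5)
words = torch.randint(1, 50, (2, 7))
chars = torch.randint(1, 20, (2, 7, 4))
logits = model(words, chars)
```

The output holds one score per tag for every token, so it can be fed to a token-level cross-entropy loss. Configuration 2 would correspond to dropping the `char_cnn` branch, and pretrained vectors could be copied into `word_embed.weight` before training.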

Learning rate finder

Automatic learning rate finder based on pytorch-lr-finder.

Note: because the learning rate for every model is picked automatically from the same range, it may not be the best one for each model. For the best learning rates, see the Google Colab version.
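The idea behind such a finder is a learning rate range test: take one training step per learning rate while the rate grows exponentially, record the loss, stop once the loss diverges, and suggest the rate where the loss falls fastest. The sketch below shows this in plain Python on a toy quadratic loss. It does not use the pytorch-lr-finder API and omits the loss smoothing such tools usually apply; the function names, the default range, and the divergence threshold are all assumptions for illustration.

```python
import math


def lr_range_test(loss_and_grad, w0, start_lr=1e-5, end_lr=10.0,
                  num_iter=100, diverge_factor=4.0):
    """Take one gradient step per learning rate, raising the rate exponentially."""
    gamma = (end_lr / start_lr) ** (1.0 / (num_iter - 1))
    w, lrs, losses = w0, [], []
    best = math.inf
    for i in range(num_iter):
        lr = start_lr * gamma ** i
        loss, grad = loss_and_grad(w)
        # stop as soon as the loss blows up relative to the best value seen so far
        if not math.isfinite(loss) or loss > diverge_factor * best:
            break
        best = min(best, loss)
        lrs.append(lr)
        losses.append(loss)
        w = w - lr * grad
    return lrs, losses


def suggest_lr(lrs, losses):
    """Return the rate at which the loss drops fastest per unit of log10(lr)."""
    slopes = [
        (losses[i + 1] - losses[i]) / (math.log10(lrs[i + 1]) - math.log10(lrs[i]))
        for i in range(len(losses) - 1)
    ]
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest]


def quadratic(w):
    """Toy objective (w - 3)^2 and its gradient, standing in for a training step."""
    return (w - 3.0) ** 2, 2.0 * (w - 3.0)


lrs, losses = lr_range_test(quadratic, w0=0.0)
suggested = suggest_lr(lrs, losses)
print(f"suggested learning rate: {suggested:.4g}")
```

Because every model is swept over the same `start_lr` to `end_lr` range, the suggested value is only a starting point, which matches the caveat in the note above.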

Example output:

[Figure: LR finder example output]

Final result

[Figure: final results]

Main reference

Gunawan, W., Suhartono, D., Purnomo, F., & Ongko, A. (2018). Named-entity recognition for Indonesian language using bidirectional LSTM-CNNs. Procedia Computer Science, 135, 425-432.

About

Named Entity Recognition with BiLSTM, CRF, and Attention-based models implemented in PyTorch for Indonesian News.