sheng-z/figet

Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds

Source code and data for the StarSem'18 paper "Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds".

Citation

The source code and data in this repository aim to facilitate the study of fine-grained entity typing. If you use the code or data, please cite the paper as follows:

@InProceedings{zhang-EtAl:2018:starSEM,
  author    = {Zhang, Sheng  and  Duh, Kevin  and  {Van Durme}, Benjamin},
  title     = {{Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds}},
  booktitle = {Proceedings of the 7th Joint Conference on Lexical and Computational Semantics (*SEM 2018)},
  month     = {June},
  year      = {2018}
}

Benchmark Performance

1. OntoNotes (Gillick et al., 2014)

Approach                          Strict F1   Macro F1   Micro F1
Our Approach                      55.52       73.33      67.61
    w/o Adaptive thresholds       53.49       73.11      66.78
    w/o Document-level contexts   53.17       72.14      66.51

Approach                          Strict F1   Macro F1   Micro F1
Our Approach                      60.23       78.67      75.52
    w/o Adaptive thresholds       60.05       78.50      75.39

Approach                          Strict F1   Macro F1   Micro F1
Our Approach                      60.87       77.75      76.94
    w/o Adaptive thresholds       58.47       75.84      75.03
    w/o Document-level contexts   58.12       75.65      75.11
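
For readers unfamiliar with the three columns above, here is a minimal sketch of how strict, macro, and micro F1 are commonly defined in fine-grained entity typing. It assumes each mention has a gold set and a predicted set of type labels. The function names and example types are hypothetical, and the repository's own evaluation code may differ in detail (for instance, in how empty predictions are counted). The tables report percentages, while this sketch returns fractions.

```python
from __future__ import division, print_function


def _f1(p, r):
    # Harmonic mean of precision and recall, defined as 0 when both are 0.
    return 2 * p * r / (p + r) if p + r > 0 else 0.0


def strict_f1(gold, pred):
    # A mention counts only if its predicted type set equals the gold set exactly.
    # Assumption: every mention is scored, so precision == recall == this fraction.
    exact = sum(1 for g, q in zip(gold, pred) if g == q)
    return exact / len(gold)


def macro_f1(gold, pred):
    # Average per-mention precision and per-mention recall, then combine.
    # Assumption: a mention with an empty set contributes 0 to the average.
    n = len(gold)
    p = sum(len(g & q) / len(q) for g, q in zip(gold, pred) if q) / n
    r = sum(len(g & q) / len(g) for g, q in zip(gold, pred) if g) / n
    return _f1(p, r)


def micro_f1(gold, pred):
    # Pool the counts of all (mention, type) pairs before computing P and R.
    overlap = sum(len(g & q) for g, q in zip(gold, pred))
    n_pred = sum(len(q) for q in pred)
    n_gold = sum(len(g) for g in gold)
    p = overlap / n_pred if n_pred else 0.0
    r = overlap / n_gold if n_gold else 0.0
    return _f1(p, r)


# Hypothetical toy data: three mentions with hierarchical type labels.
gold = [{"/person", "/person/artist"},
        {"/location"},
        {"/organization", "/organization/company"}]
pred = [{"/person"},
        {"/location"},
        {"/organization", "/organization/company", "/location"}]

strict = strict_f1(gold, pred)
macro = macro_f1(gold, pred)
micro = micro_f1(gold, pred)
print("strict=%.4f macro=%.4f micro=%.4f" % (strict, macro, micro))
```

Strict F1 is the harshest of the three because a single missing or extra type makes the whole mention count as wrong, which is why it is the lowest number in every row.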

Prerequisites

  • Python 2.7
  • PyTorch 0.2.0 (w/ CUDA support)
  • Numpy
  • tqdm
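
A quick way to confirm the environment before running anything is to try importing each package and print its version. The snippet below is a minimal sketch, not part of the repository; `check_prerequisites` is a hypothetical helper name.

```python
from __future__ import print_function
import importlib
import sys


def check_prerequisites(modules=("torch", "numpy", "tqdm")):
    """Return {module name: version string, or None if it cannot be imported}."""
    found = {}
    for name in modules:
        try:
            mod = importlib.import_module(name)
            found[name] = getattr(mod, "__version__", "unknown")
        except Exception:
            # Missing package, or a broken install (e.g. missing shared libraries).
            found[name] = None
    return found


report = check_prerequisites()
print("python", sys.version.split()[0])
for name in sorted(report):
    print(name, report[name] if report[name] is not None else "NOT FOUND")

if report["torch"] is not None:
    import torch
    # torch.cuda.is_available() reports whether PyTorch can see a CUDA device.
    print("cuda available:", torch.cuda.is_available())
```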

Running

Once the prerequisites are installed, the whole pipeline runs through a single script, one subcommand per step. Taking the OntoNotes corpus as an example:

Step 1: Download the data

./scripts/ontonotes.sh get_data

Step 2: Preprocess the data

./scripts/ontonotes.sh preprocess

Step 3: Train the model

./scripts/ontonotes.sh train

Step 4: Tune the threshold

./scripts/ontonotes.sh adaptive-thres
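
To give a rough idea of what tuning per-type thresholds can look like, the sketch below picks, for each type, the threshold on a small grid that maximizes that type's F1 on held-out data. This is an illustration under stated assumptions and not the code behind the `adaptive-thres` step: the function names, the grid, the 0.5 default, and the per-type F1 objective are all hypothetical.

```python
from __future__ import division, print_function


def f1_at(scores_k, labels_k, t):
    # F1 of one type when every mention scoring >= t is assigned that type.
    tp = sum(1 for s, y in zip(scores_k, labels_k) if s >= t and y)
    fp = sum(1 for s, y in zip(scores_k, labels_k) if s >= t and not y)
    fn = sum(1 for s, y in zip(scores_k, labels_k) if s < t and y)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(scores, labels, grid=None, default=0.5):
    """scores/labels: one row per mention, one column per type."""
    grid = grid or [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    thresholds = []
    for k in range(len(scores[0])):
        col_s = [row[k] for row in scores]
        col_y = [row[k] for row in labels]
        # Start from the fixed default and move only on a strict improvement.
        best_t, best_f1 = default, f1_at(col_s, col_y, default)
        for t in grid:
            f = f1_at(col_s, col_y, t)
            if f > best_f1:
                best_t, best_f1 = t, f
        thresholds.append(best_t)
    return thresholds


def apply_thresholds(score_row, thresholds):
    # Indices of the types whose score clears that type's own threshold.
    return [k for k, (s, t) in enumerate(zip(score_row, thresholds)) if s >= t]


# Hypothetical held-out scores for two types: type 1 is systematically
# under-scored, so a fixed 0.5 cut-off would never predict it.
scores = [[0.9, 0.32], [0.8, 0.22], [0.4, 0.37], [0.2, 0.12]]
labels = [[1, 1], [1, 0], [0, 1], [0, 0]]

thresholds = tune_thresholds(scores, labels)
print(thresholds)
print(apply_thresholds([0.7, 0.3], thresholds))
```

In this toy data the first type keeps the default threshold, while the second gets a lower one, so a mention scoring 0.3 on it is still assigned that type.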

Step 5: Do inference

./scripts/ontonotes.sh inference
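
The five steps above can also be chained from a small driver that stops at the first failure. This is a convenience sketch rather than something shipped with the repository; `build_commands` and `run_pipeline` are hypothetical names, and the script path and stage names are the ones listed in the steps above.

```python
from __future__ import print_function
import subprocess

# Stage names in the order given by steps 1 to 5.
STAGES = ["get_data", "preprocess", "train", "adaptive-thres", "inference"]


def build_commands(script="./scripts/ontonotes.sh", stages=STAGES):
    # One argv list per stage, e.g. ["./scripts/ontonotes.sh", "train"].
    return [[script, stage] for stage in stages]


def run_pipeline(script="./scripts/ontonotes.sh", runner=subprocess.check_call):
    for cmd in build_commands(script):
        print("running:", " ".join(cmd))
        # check_call raises CalledProcessError on a non-zero exit status,
        # so later stages never run on top of a failed one.
        runner(cmd)


commands = build_commands()
for cmd in commands:
    print(" ".join(cmd))
```

Calling `run_pipeline()` from the repository root would execute the stages in order. The `runner` argument exists only so the loop can be exercised without launching the real scripts.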

Acknowledgements

The datasets (Wiki and OntoNotes) are copied from Sonse Shimaoka's repository.

License

MIT